Abstract. The use of interpolants in model checking is becoming an enabling technology 
to allow fast and robust verification of hardware and software. The application of encodings 
based on the theory of arrays, however, is limited by the impossibility of deriving quantifier- 
free interpolants in general. 

In this paper, we show that it is possible to obtain quantifier-free interpolants for a 
Skolemized version of the extensional theory of arrays. We prove this in two ways: 

(1) non-constructively, by using the model-theoretic notion of amalgamation, which is 
known to be equivalent to admitting quantifier-free interpolation for universal theories; 
and 

(2) constructively, by designing an interpolating procedure, based on solving equations 
between array updates. (Interestingly, rewriting techniques are used in the key steps 
of the solver and its proof of correctness.) 

To the best of our knowledge, this is the first successful attempt at computing 
quantifier-free interpolants for a variant of the theory of arrays with extensionality. 
1. Introduction 

Craig's interpolation theorem [23] applies to first-order logic formulae and states that 
whenever the sequent $A \wedge B \Rightarrow \bot$ is valid, then it is possible to derive a 
formula $I$ such that (i) $A \Rightarrow I$ is valid, (ii) $I \wedge B \Rightarrow \bot$ is valid, 
and (iii) $I$ is defined over the common symbols of $A$ and $B$.^1 After the seminal work of 
McMillan (see, e.g., [H]), Craig's interpolation has become an important technique in 
verification. Intuitively, the interpolant $I$ can be seen as an over-approximation of $A$ 
with respect to $B$. This observation is crucial for several applications of interpolation in 
verification. For example, it has been observed that computing quantifier-free interpolants 
is important for over-approximating the set of reachable states in model checking, since 
several symbolic verification procedures represent sets of states and transitions as 
quantifier-free formulae. Unfortunately, Craig's interpolation theorem does not guarantee 
that it is always possible to compute quantifier-free interpolants. Even worse, for certain 
first-order theories, it is known that quantifiers must occur in interpolants of 
quantifier-free formulae [37]. As a consequence, several 
papers [IIl[2ll[22l[Ml[3ai40l|42l|46l[50l[52l[Ml[^ focused on the efficient computation of 
quantifier-free interpolants for first-order theories which are relevant for verification such 
as uninterpreted functions, (fragments of) Presburger arithmetic, theories of some data- 
structures, and their combination. Despite the ongoing efforts, so far, only the negative 
result in [37] is available for the computation of interpolants in the theory of arrays with 
extensionality, axiomatized by the following three sentences: 

$\forall y, i, e.\ rd(wr(y, i, e), i) = e$ 

$\forall y, i, j, e.\ i \neq j \Rightarrow rd(wr(y, i, e), j) = rd(y, j)$ 

$\forall x, y.\ x \neq y \Rightarrow (\exists i.\ rd(x, i) \neq rd(y, i))$ 

where rd and wr are the usual operations for reading or updating arrays, respectively. For 
instance, there is no quantifier-free interpolant for the pair of quantifier-free formulae 

$A \equiv x = wr(y, i, e)$ 

$B \equiv rd(x, j) \neq rd(y, j) \wedge rd(x, k) \neq rd(y, k) \wedge j \neq k$. 

This theory is important for both hardware and software verification, and a procedure for 
computing quantifier-free interpolants "would extend the utility of interpolant extraction as 
a tool in the verifier's toolkit" [H]. Indeed, the endeavour of designing such a procedure 
would be bound to fail (according to [37]) if we restrict ourselves to the original theory. 
To circumvent the problem, we add the (binary) function diff to rd and wr. Intuitively, 
diff(a, b) is an index at which the elements stored in the arrays a and b are different 
(diff(a, b) is defined arbitrarily in case a and b coincide). Formally, this is characterized 
by Skolemizing the third axiom above (also called the extensionality axiom) to obtain 

$\forall x, y.\ x \neq y \Rightarrow rd(x, \mathtt{diff}(x, y)) \neq rd(y, \mathtt{diff}(x, y))$. 

This axiom is sufficient to ensure that the theory of arrays with diff admits quantifier-free 
interpolants for quantifier-free formulae or, equivalently, that the quantifier-free fragment 
of the theory is closed under interpolation. For example, a quantifier-free interpolant for A 

^1 To be precise, the original formulation of [23] is slightly different, and it states that 
whenever $A \Rightarrow B$ is valid, then it is possible to derive an $I$ such that 
$A \Rightarrow I$ and $I \Rightarrow B$ are valid, and $I$ is over the common symbols 
of $A$ and $B$. Clearly, the two formulations are equivalent. 
and B above is 

$I \equiv x = wr(y, \mathtt{diff}(x, y), rd(x, \mathtt{diff}(x, y)))$. 

Notice how diff makes it possible to represent indexes in the quantifier-free interpolant $I$ 
by mentioning only the array constants x and y that are common to A and B. As we will see in 
the rest of the paper, this is crucial to compute quantifier-free interpolants. One may wonder 
how useful it is to be able to compute quantifier-free interpolants in the Skolemized variant 
of the theory of arrays with extensionality considered here. The answer lies in the 
observation that this variant is sufficient whenever there is a need to check the 
unsatisfiability of formulae, as is the case in many applications; one of the most important 
is in model checking procedures for infinite state systems (see, e.g., [M]). 
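
The interpolant claimed for A and B above can be sanity-checked by brute force on a small standard model. The following is only a sketch under stated assumptions: the finite index and element domains, the tuple encoding of arrays, and the particular choice of diff (the least differing index) are all illustrative and not taken from the paper. A check on one finite model illustrates the claim but does not prove it.

```python
from itertools import product

# Assumed finite sorts, chosen only to make exhaustive checking feasible.
INDEXES = range(3)
ELEMS = range(2)
ARRAYS = list(product(ELEMS, repeat=len(INDEXES)))  # all functions index -> element


def rd(a, i):
    return a[i]


def wr(a, i, e):
    return a[:i] + (e,) + a[i + 1:]


def diff(a, b):
    # One admissible Skolem function: least index where a and b differ (0 if a == b).
    return next((i for i in INDEXES if a[i] != b[i]), 0)


def formula_a(x, y, i, e):
    return x == wr(y, i, e)


def formula_b(x, y, j, k):
    return rd(x, j) != rd(y, j) and rd(x, k) != rd(y, k) and j != k


def formula_i(x, y):
    d = diff(x, y)
    return x == wr(y, d, rd(x, d))


def a_entails_i():
    # Every assignment satisfying A also satisfies I.
    return all(
        formula_i(x, y)
        for x, y in product(ARRAYS, repeat=2)
        for i in INDEXES
        for e in ELEMS
        if formula_a(x, y, i, e)
    )


def i_and_b_unsat():
    # No assignment satisfies both I and B.
    return not any(
        formula_i(x, y) and formula_b(x, y, j, k)
        for x, y in product(ARRAYS, repeat=2)
        for j in INDEXES
        for k in INDEXES
    )
```

Note that `formula_i` mentions only `x` and `y`, the symbols shared by A and B, which is the point of introducing diff.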

1.1. Contributions. The paper presents two main contributions, that are strictly related 
but completely independent. 

First, we prove non-constructively that given two quantifier-free formulae in the 
theory of arrays with diff, it is possible to compute a quantifier-free interpolant. We 
do this by using the notion of amalgamation [20^132). Intuitively, a first-order theory has 
the amalgamation property if any two structures in its class of models sharing a common 
sub-model can be regarded as sub-structures of a larger model. A well-known result (see, 
e.g., [7]) states that if the class of models of a universal theory $T$ (namely, a theory 
axiomatized by sentences obtained by prefixing a quantifier-free formula with a block of 
universal quantifiers) has the amalgamation property, then $T$ admits quantifier-free 
interpolants for quantifier-free formulae in the theory and vice versa. Since the theory of arrays with diff is 
universal, we consider the problem of showing that its class of models has the amalgamation 
property. We provide a first, non-constructive, proof of this result by using model-theoretic 
notions only. 

The second contribution of the paper is an algorithm for the generation of quantifier- 
free interpolants from finite sets (intended conjunctively) of literals in the theory of arrays 
with diff. Our algorithm uses as a sub-module a satisfiability procedure for sets of literals 
of the theory. Such a module is based on a sequence of syntactic manipulations organized 
in groups of syntactic transformations. The most important group of transformations is 
a Knuth-Bendix completion procedure (see, e.g., [1]) extended in such a way as to solve an 
equation $a = wr(b, i, e)$ for $b$ when this is required by the ordering defined on terms. (We call 
Gaussian completion this extended procedure because of its similarity with the techniques 
to handle Gaussian theories [3].) The goal of these transformations is to produce what we 
call a "modular" constraint for which it is trivial to establish satisfiability. Given two sets 
A and B of literals, the satisfiability procedure is invoked on A and B. While running, 
the two instances of the procedure exchange literals on the common signature of A and B 
(similarly to the Nelson and Oppen combination method, see, e.g., [51]) and perform some 
additional actions. At the end of the computation, the execution trace is examined and 
the desired interpolant is built by simple rules whose goal is to produce a set of literals on 
the common signature of A and B. In fact, the problem during the execution of Gaussian 
completion is to avoid the generation of equalities containing terms built out of non-shared 
symbols. Notice that our approach seems to be quite different from the standard method 
of extracting interpolants from an unsatisfiability proof of A and B in a given calculus 
(e.g., |lHI46j). Theoretically, it is not difficult to refine our proof of termination to show 
that the proposed algorithm is in NP, which is optimal since the satisfiability problem of 
quantifier-free formulae in the theory of arrays with extensionality is NP-complete (see, 

e.g., m)- 

1.2. Plan of the paper. In Section 2 we recall some background notions about theories, 
model-theoretic notions, and rewriting. In Section 3 we define the theory of arrays with 
diff, characterize its models, and show non-constructively that it admits quantifier-free 
interpolation. The rest of the paper is devoted to proving the same result constructively. In 
Section 4 we introduce modular constraints (which will be manipulated by the interpolation 
procedure) and state (and prove) their key properties. In Section 5 we describe the 
satisfiability solver for the theory of arrays with diff based on syntactic transformations 
of modular constraints. Then, in Section 6 we extend such a solver to produce 
quantifier-free interpolants by using a carefully designed set of meta-rules for interpolation. 
Finally, in Section 7 we extensively discuss the related work and conclude. 

The appendix contains a proof of the result in [7] to make the paper self-contained. 

2. Formal preliminaries 

We assume the usual syntactic (e.g., signature, variable, term, atom, literal, formula, and 
sentence) and semantic (e.g., structure, truth, satisfiability, and validity) notions of 
first-order logic. The equality symbol "=" is included in all signatures considered below. For 
clarity, we shall use "$\equiv$" in the meta-theory to express the syntactic identity between two 
symbols or two strings of symbols, or to introduce a new definition. 

2.1. Theories, constraints, interpolants. A theory $T$ is a pair $(\Sigma, Ax_T)$, where 
$\Sigma$ is a signature and $Ax_T$ is a set of $\Sigma$-sentences, called the axioms of $T$ (we 
shall sometimes write directly $T$ for $Ax_T$). The $\Sigma$-structures in which all sentences 
from $Ax_T$ are true are the models of $T$. A universal (resp. existential) sentence is obtained 
by prefixing a string of universal (resp. existential) quantifiers to a quantifier-free formula. 
A theory $T$ is universal iff $Ax_T$ consists of universal sentences. A $\Sigma$-formula $\phi$ 
is $T$-satisfiable if there exists a model $\mathcal{M}$ of $T$ such that $\phi$ is true in 
$\mathcal{M}$ under a suitable assignment $\mathsf{a}$ to the free variables of $\phi$ (in 
symbols, $(\mathcal{M}, \mathsf{a}) \models \phi$); it is $T$-valid (in symbols, $T \vdash \phi$) 
if its negation is $T$-unsatisfiable or, equivalently, iff $\phi$ is provable from the axioms of 
$T$ in a complete calculus for first-order logic. A formula $\phi_1$ $T$-entails a formula 
$\phi_2$ if $\phi_1 \to \phi_2$ is $T$-valid; the notation used for such $T$-entailment is 
$\phi_1 \vdash_T \phi_2$ or simply $\phi_1 \vdash \phi_2$, if $T$ is clear from the context. The 
satisfiability modulo the theory $T$ ($SMT(T)$) problem amounts to establishing the 
$T$-satisfiability of quantifier-free $\Sigma$-formulae. 

Let $T$ be a theory in a signature $\Sigma$; a $T$-constraint (or, simply, a constraint) $A$ is 
a set of ground literals in a signature $\Sigma'$ obtained from $\Sigma$ by adding a set of free 
constants. Taking conjunction, we can consider a finite constraint $A$ as a single formula; thus, 
when we say that a constraint $A$ is $T$-satisfiable (or just "satisfiable" if $T$ is clear from 
the context), we mean that the associated formula (also called $A$) is satisfiable in a 
$\Sigma'$-structure which is a model of $T$. Let $a_1, \dots, a_n$ be the tuple of free constants 
occurring in a sentence $A$ and $x_1, \dots, x_n$ be a tuple of fresh distinct individual 
variables; the formula $A^{\exists}$ is obtained from $A$ by replacing each $a_i$ with $x_i$ (for 
$i = 1, \dots, n$) and then existentially quantifying $x_1, \dots, x_n$, i.e. $A^{\exists}$ denotes 
the formula $\exists x_1 \cdots \exists x_n\, A(x_1/a_1, \dots, x_n/a_n)$. We have two notions of 
equivalence between constraints, which are summarized in the next definition. 
Definition 2.1. Let $A$ and $B$ be finite constraints (or, more generally, first-order sentences) 
in an expanded signature. We say that $A$ and $B$ are logically equivalent (modulo $T$) iff 
$T \vdash A \leftrightarrow B$; on the other hand, we say that they are $\exists$-equivalent 
(modulo $T$) iff $T \vdash A^{\exists} \leftrightarrow B^{\exists}$. 

Logical equivalence means that the constraints have the same semantic content (modulo 
$T$); $\exists$-equivalence is also useful because we are mainly interested in $T$-satisfiability 
of constraints and it is trivial to see that $\exists$-equivalence implies equisatisfiability 
(again, modulo $T$). As an example, if we take a constraint $A$, we replace all occurrences of a 
certain term $t$ in it by a fresh constant $a$ and add the equality $a = t$, called the (explicit) 
definition (of $t$), the constraint $A'$ we obtain in this way is $\exists$-equivalent to $A$. As 
another example, suppose that $A \vdash_T a = t$, that $a$ does not occur in $t$, and that $A'$ is 
obtained from $A$ by replacing $a$ by $t$ everywhere; then the following four constraints are 
$\exists$-equivalent 

$A, \quad A \cup \{a = t\}, \quad A' \cup \{a = t\}, \quad A'$ 

(the first three are also pairwise logically equivalent). The above examples show how explicit 
definitions can be introduced and removed from constraints while preserving $\exists$-equivalence. 

A theory $T$ is said to admit quantifier-free interpolation (or, equivalently, to have 
quantifier-free interpolants) iff for every pair of quantifier-free formulae $\phi, \psi$ such 
that $\psi \wedge \phi$ is not $T$-satisfiable, there exists a quantifier-free formula $\theta$, 
called an interpolant, such that: (i) $\psi$ $T$-entails $\theta$; (ii) $\theta \wedge \phi$ is not 
$T$-satisfiable; (iii) only variables occurring both in $\psi$ and in $\phi$ occur in $\theta$. 

2.2. Some model theoretic concepts and results. We recall some basic model-theoretic 
notions that will be used in the paper (for more details, the interested reader is pointed to 
standard textbooks in model theory, such as [20]). 

If $\Sigma$ is a signature, we use the notation $\mathcal{M} = (M, \mathcal{I})$ for a 
$\Sigma$-structure, meaning that $M$ is the support of $\mathcal{M}$ and $\mathcal{I}$ is the 
related interpretation function for $\Sigma$-symbols (in a many-sorted framework, the support is 
the disjoint union of the interpretations of the sort symbols of $\Sigma$). 

Roughly, an embedding is a homomorphism that preserves and reflects relations and 
operations. Formally, a $\Sigma$-embedding (or, simply, an embedding) between two 
$\Sigma$-structures $\mathcal{M} = (M, \mathcal{I})$ and $\mathcal{N} = (N, \mathcal{J})$ is any 
mapping $\mu : M \to N$ among the corresponding support sets satisfying the following three 
conditions: (a) $\mu$ is a (sort-preserving) injective function; (b) $\mu$ is an algebraic 
homomorphism, that is for every $n$-ary function symbol $f$ and for every 
$a_1, \dots, a_n \in M$, we have 
$f^{\mathcal{N}}(\mu(a_1), \dots, \mu(a_n)) = \mu(f^{\mathcal{M}}(a_1, \dots, a_n))$; (c) $\mu$ 
preserves and reflects interpreted predicates, i.e. for every $n$-ary predicate symbol $P$, we 
have $(a_1, \dots, a_n) \in P^{\mathcal{M}}$ iff $(\mu(a_1), \dots, \mu(a_n)) \in P^{\mathcal{N}}$. 
By using simple set theory, it is possible to show that every embedding can be factored into an 
isomorphism and an inclusion. This means that if $\mu$ is an embedding from $\mathcal{M}$ to 
$\mathcal{N}$, it is possible to assume that, up to an isomorphism, $\mathcal{M}$ is a 
substructure of $\mathcal{N}$, in the sense defined below. 

If $M \subseteq N$ and the embedding $\mu : \mathcal{M} \to \mathcal{N}$ is just the identity 
inclusion $M \subseteq N$, we say that $\mathcal{M}$ is a substructure of $\mathcal{N}$ or that 
$\mathcal{N}$ is a superstructure of $\mathcal{M}$. Notice that a substructure of $\mathcal{N}$ is 
nothing but a subset of the carrier set of $\mathcal{N}$ which is closed under the 
$\Sigma$-operations and whose $\Sigma$-structure is inherited from $\mathcal{N}$ by restriction. 
In fact, given $\mathcal{N} = (N, \mathcal{J})$ and $G \subseteq N$, there exists the smallest 
substructure of $\mathcal{N}$ containing $G$ in its carrier set. This is called the substructure 
generated by $G$ and its carrier set can be characterized as the set of the elements $b \in N$ 
such that $t^{\mathcal{N}}(\underline{a}) = b$ for some $\Sigma$-term $t$ and some finite tuple 
$\underline{a}$ from $G$ (when we write $t^{\mathcal{N}}(\underline{a}) = b$, we mean that 
$(\mathcal{N}, \mathsf{a}) \models t(\underline{x}) = y$ for an assignment $\mathsf{a}$ mapping 
the $\underline{a}$ to the $\underline{x}$ and $b$ to $y$). An easy but fundamental fact is 
that the truth of a universal (resp. existential) sentence is preserved through substructures 
(resp. through superstructures). 

Let $\mathcal{M} = (M, \mathcal{I})$ be a $\Sigma$-structure which is generated by 
$G \subseteq M$. Let us expand $\Sigma$ with a set of fresh free constants in such a way that in 
the expanded signature $\Sigma_G$ there is a fresh free constant $c_g$ for every $g \in G$ (we 
write $g$ directly for $c_g$ for simplicity). Let $\mathcal{M}^G$ be the $\Sigma_G$-structure 
obtained from $\mathcal{M}$ by interpreting each $c_g$ as $g$. The $\Sigma_G$-diagram 
$\delta_{\mathcal{M}}(G)$ of $\mathcal{M}$ is the set of all ground $\Sigma_G$-literals $L$ such 
that $\mathcal{M}^G \models L$. When we speak of the diagram of $\mathcal{M}$ tout court, we mean 
the $\Sigma_M$-diagram $\delta_{\mathcal{M}}(M)$. 

The following celebrated result [20] is simple, but nevertheless very powerful and it will 
be used in the rest of the paper. 

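Lemma 2.2 (Robinson Diagram Lemma). Let $\mathcal{M} = (M, \mathcal{I})$ be a 
$\Sigma$-structure which is generated by $G \subseteq M$ and $\mathcal{N} = (N, \mathcal{J})$ be 
another $\Sigma$-structure. Then, there is a bijective correspondence between 
$\Sigma$-embeddings $\mu : \mathcal{M} \to \mathcal{N}$ and $\Sigma_G$-expansions 
$\mathcal{N}^G = (N, \mathcal{J}^G)$ of $\mathcal{N}$ such that 
$\mathcal{N}^G \models \delta_{\mathcal{M}}(G)$. The correspondence associates with $\mu$ the 
extension of $\mathcal{J}$ to $\Sigma_G$ given by $\mathcal{J}^G(c_g) = \mu(g)$. 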

Notice that an embedding $\mu : \mathcal{M} \to \mathcal{N}$ is uniquely determined, in case it 
exists, by the image of the set of generators $G$: this is because the fact that $G$ generates 
$\mathcal{M}$ implies (and is equivalent to) the fact that every $c \in M$ is of the kind 
$t^{\mathcal{M}}(\underline{g})$, for some term $t$ and some $\underline{g}$ from $G$. 

Intuitively, amalgamation is a property of collections of structures that guarantees that 
two structures in the collection can be glued into substructures of a larger one. Formally, a 
theory T is said to have the amalgamation property iff whenever we are given embeddings 

$\mu_1 : \mathcal{N} \to \mathcal{M}_1, \quad \mu_2 : \mathcal{N} \to \mathcal{M}_2$ 

among the models $\mathcal{N}, \mathcal{M}_1, \mathcal{M}_2$ of $T$, then there exists a further 
model $\mathcal{M}$ of $T$ endowed with embeddings 

$\nu_1 : \mathcal{M}_1 \to \mathcal{M}, \quad \nu_2 : \mathcal{M}_2 \to \mathcal{M}$ 

such that $\nu_1 \circ \mu_1 = \nu_2 \circ \mu_2$. Notice that, up to isomorphism, we can limit 
ourselves in the above definition to the case in which $\mu_1, \mu_2$ are inclusions, i.e. to the 
case in which $\mathcal{N}$ is just a substructure of both $\mathcal{M}_1, \mathcal{M}_2$; in 
this case, $\mathcal{M}$ is said to be a $T$-amalgam of $\mathcal{M}_1$ and $\mathcal{M}_2$ 
over $\mathcal{N}$. (When the signature does not have ground terms of some sort, models 
$\mathcal{N}$ having empty domain(s) must be included in the definition of amalgamation property.) 

Theorem 2.3 ([7]). Let $T$ be universal; then $T$ admits quantifier-free interpolants iff $T$ 
has the amalgamation property. 

We emphasize that the hypothesis for $T$ to be universal is necessary for the above result 
to hold. To make the paper self-contained, we include the proof of this result in the appendix. 

2.3. Some term rewriting concepts and results. We shall need basic term rewriting 
system notions and results (see, e.g., [1]). In the following, we recall some of the most 
important ones for this paper. 

The reflexive and transitive closure of a binary relation $\to$ is denoted with $\to^*$ and 
its transitive closure by $\to^+$. A binary relation $\to$ over a set $E$ is terminating if there 
are no infinite sequences $e_0, e_1, \dots$ of elements of $E$ such that 
$(e_i, e_{i+1}) \in \to$, also written as $e_i \to e_{i+1}$, for every $i \geq 0$. The relation 
$\to \subseteq E \times E$ is confluent if there exists $v \in E$ such that $s \to^* v$ and 
$t \to^* v$ whenever $u \to^* s$ and $u \to^* t$, for $s, t, u \in E$. The relation $\to$ is 
convergent if it is both terminating and confluent. 

A rewrite rule is an ordered pair of terms $l$ and $r$, written as $l \to r$ (intuitively, the 
rule is used to replace instances of $l$ with instances of $r$).^2 A (term-) rewriting system is a 
set $R$ of rewrite rules, which induces a rewrite relation $\to_R$ (or simply $\to$ when $R$ is 
clear from the context) on terms as follows: $\to_R$ is the relation that contains the pairs of 
terms $(t, t')$ such that (for some $l \to r$ in $R$) the term $t$ has a sub-term of the form 
$l\sigma$ for some substitution $\sigma$ (in symbols $t \equiv t[l\sigma]$), and $t'$ is obtained 
by replacing that subterm $l\sigma$ by $r\sigma$ in $t$ (in symbols $t' \equiv t[r\sigma]$). Let 
$s$ and $t$ be terms; we say that $s$ and $t$ are joinable w.r.t. a rewrite relation $\to$ (in 
symbols, $s \downarrow t$) when there exists a term $u$ such that $s \to^* u$ and $t \to^* u$. A 
term $t$ is reducible w.r.t. a rewrite relation $\to$ if there exists a term $u$ such that 
$t \to u$; otherwise, $t$ is irreducible. A term $u$ is a normal form of $t$ w.r.t. a rewrite 
relation $\to$ if $t \to^* u$ and $u$ is irreducible. A rewrite relation is ground convergent when 
it is convergent once restricted to the set of ground terms. Convergent rewrite relations are 
interesting because they have unique normal forms. Knuth-Bendix completion is a procedure, based 
on superposition of critical pairs, for transforming a rewrite system into a confluent one (see, 
e.g., [1] for details). Termination of rewrite systems is undecidable. 
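
To make the notions of reducibility and normal form concrete, here is a minimal sketch for the ground case only (rules without variables, so no substitutions are needed). The tuple encoding of terms, the function names and the `max_steps` guard are illustrative assumptions; the guard is there because termination is not guaranteed for an arbitrary rule set.

```python
def rewrite_once(term, rules):
    """Return one-step rewrite of `term`, or None if `term` is irreducible.

    Terms are nested tuples (symbol, arg_1, ..., arg_n); constants are 1-tuples.
    `rules` is a list of (lhs, rhs) pairs of ground terms.
    """
    # Try to rewrite at the root.
    for lhs, rhs in rules:
        if term == lhs:
            return rhs
    # Otherwise rewrite the leftmost reducible argument.
    head, args = term[0], term[1:]
    for pos, arg in enumerate(args):
        reduced = rewrite_once(arg, rules)
        if reduced is not None:
            return (head,) + args[:pos] + (reduced,) + args[pos + 1:]
    return None


def normal_form(term, rules, max_steps=1000):
    """Rewrite until irreducible; give up after `max_steps` steps."""
    for _ in range(max_steps):
        nxt = rewrite_once(term, rules)
        if nxt is None:
            return term
        term = nxt
    raise RuntimeError("no normal form reached within max_steps")


def joinable(s, t, rules):
    # Sound as a joinability test only when the rewrite relation is ground convergent.
    return normal_form(s, rules) == normal_form(t, rules)
```

For a convergent system, comparing normal forms as in `joinable` decides whether two ground terms are joinable, which is why unique normal forms matter.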

A quasi-ordering is a reflexive and transitive relation. The lexicographic path ordering 
$\succ$ on a set of terms induced by a quasi-ordering $>$, called precedence relation, on the 
set of constant and function symbols on which the terms are built is defined as follows: 

$s \equiv f(s_1, \dots, s_m) \succ g(t_1, \dots, t_n) \equiv t$ if 

(1) $s_k \succ t$ or $s_k \equiv t$ for some $k \in \{1, \dots, m\}$, or 

(2) $f > g$ and $s \succ t_i$ for each $i \in \{1, \dots, n\}$, or 

(3) $f = g$, $s_1 \equiv t_1, \dots, s_{j-1} \equiv t_{j-1}$, $s_j \succ t_j$, $s \succ t_{j+1}, \dots, s \succ t_n$ for some $j \in \{1, \dots, n\}$. 
If the precedence relation $>$ is also total, then so is $\succ$ once restricted to ground terms. 
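
The three cases of the definition translate almost directly into a recursive comparison on ground terms. The sketch below assumes a strict total precedence given as integer ranks (a simplification of the quasi-ordering in the text), and the precedence `PREC` with `a > wr > b > i > e` is a hypothetical example, not the ordering used later in the paper.

```python
def lpo_greater(s, t, prec):
    """Return True if s is greater than t in the lexicographic path ordering.

    Terms are nested tuples (symbol, arg_1, ..., arg_n); constants are 1-tuples.
    `prec` maps each symbol to an integer rank (assumption: strict total precedence).
    """
    f, s_args = s[0], s[1:]
    g, t_args = t[0], t[1:]
    # Case (1): some argument of s equals t or is greater than t.
    if any(sk == t or lpo_greater(sk, t, prec) for sk in s_args):
        return True
    # Case (2): the head of s has higher precedence and s dominates every argument of t.
    if prec[f] > prec[g]:
        return all(lpo_greater(s, ti, prec) for ti in t_args)
    # Case (3): equal heads, compare the arguments lexicographically.
    if f == g:
        for j, (sj, tj) in enumerate(zip(s_args, t_args)):
            if sj == tj:
                continue
            return lpo_greater(sj, tj, prec) and all(
                lpo_greater(s, tk, prec) for tk in t_args[j + 1:]
            )
    return False


# Hypothetical precedence, chosen only for this illustration: a > wr > b > i > e.
PREC = {"a": 5, "wr": 4, "b": 3, "i": 2, "e": 1}
A, B, I, E = ("a",), ("b",), ("i",), ("e",)
WR_B = ("wr", B, I, E)  # the term wr(b, i, e)
```

Under this precedence the constant `a` is greater than `wr(b, i, e)`, so an equation between the two would be oriented from `a` to `wr(b, i, e)`; with a different precedence the orientation can flip.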

3. Theories of Arrays and Quantifier-free Interpolation 

The McCarthy theory of arrays $\mathcal{AX}$ [43] has three sorts ARRAY, ELEM, INDEX (called 
"array", "element", and "index" sort, respectively) and two function symbols rd and wr of 
appropriate arities; its axioms are: 

$\forall y, i, e.\ rd(wr(y, i, e), i) = e$ (3.1) 

$\forall y, i, j, e.\ i \neq j \Rightarrow rd(wr(y, i, e), j) = rd(y, j)$. (3.2) 

The theory of arrays with extensionality $\mathcal{AX}_{\mathrm{ext}}$ has the further axiom 

$\forall x, y.\ x \neq y \Rightarrow (\exists i.\ rd(x, i) \neq rd(y, i))$, 

called the 'extensionality' axiom. In this paper, we consider a variant of the McCarthy 
theory of arrays with extensionality, obtained by Skolemizing the axiom of extensionality. 
Formally, we define the theory of arrays with diff $\mathcal{AX}_{\mathtt{diff}}$ by adding the 
(Skolem) function diff to the signature of $\mathcal{AX}_{\mathrm{ext}}$ and replacing the 
extensionality axiom by its Skolemization, namely 

$\forall x, y.\ x \neq y \Rightarrow rd(x, \mathtt{diff}(x, y)) \neq rd(y, \mathtt{diff}(x, y))$. (3.3) 
^2 To avoid pathological cases, it is assumed that all variables occurring in $r$ occur also in $l$. 
The new symbol diff is binary: it takes two arguments of sort ARRAY and returns an 
element of sort INDEX. The new axiom (3.3) constrains diff to return an index at which 
the two input arrays store different values, whereas it returns an arbitrary value when 
the input arrays are equal. 
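
As an illustration of how axioms (3.1), (3.2) and (3.3) constrain rd, wr and diff, the sketch below checks them exhaustively on one small standard model. The finite domains, the tuple encoding of arrays and the choice of diff as the least differing index are assumptions made for this example only; any other function returning a differing index would satisfy (3.3) equally well.

```python
from itertools import product

# Assumed finite sorts for exhaustive checking.
INDEXES = range(3)
ELEMS = range(2)
ARRAYS = list(product(ELEMS, repeat=len(INDEXES)))  # every function index -> element


def rd(a, i):
    return a[i]


def wr(a, i, e):
    return a[:i] + (e,) + a[i + 1:]


def diff(a, b):
    # Least index where a and b differ; arbitrary (here 0) when a == b.
    return next((i for i in INDEXES if a[i] != b[i]), 0)


def axiom_3_1():
    return all(
        rd(wr(y, i, e), i) == e
        for y in ARRAYS for i in INDEXES for e in ELEMS
    )


def axiom_3_2():
    return all(
        i == j or rd(wr(y, i, e), j) == rd(y, j)
        for y in ARRAYS for i in INDEXES for j in INDEXES for e in ELEMS
    )


def axiom_3_3():
    return all(
        x == y or rd(x, diff(x, y)) != rd(y, diff(x, y))
        for x in ARRAYS for y in ARRAYS
    )
```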

3.1. A semantic argument for quantifier-free interpolation. Here, we show that 
$\mathcal{AX}_{\mathtt{diff}}$ does admit quantifier-free interpolation, contrary to 
$\mathcal{AX}_{\mathrm{ext}}$ [37]. We do so by using a model-theoretic argument based on the 
equivalence between amalgamation of the models and admitting quantifier-free interpolation for 
universal theories (recall Theorem 2.3 in Section 2.2). Notice that 
$\mathcal{AX}_{\mathtt{diff}}$ is universal whereas $\mathcal{AX}_{\mathrm{ext}}$ is not. 

Since amalgamation is a property of the models of a theory, we preliminarily discuss 
the class of models of $\mathcal{AX}_{\mathtt{diff}}$. A model of $\mathcal{AX}_{\mathrm{ext}}$ or 
$\mathcal{AX}_{\mathtt{diff}}$ is standard when ARRAY is interpreted as the set of all functions 
from indexes to elements. In a standard model of $\mathcal{AX}_{\mathrm{ext}}$ or 
$\mathcal{AX}_{\mathtt{diff}}$, arrays are interpreted as functions, rd as function application, 
and wr as the point-wise update operation (i.e. the interpretation of $wr(a, i, e)$ returns the 
same values as the interpretation of $a$, except at the interpretation of index $i$ where it 
returns the interpretation of $e$). The class of models of $\mathcal{AX}_{\mathrm{ext}}$ or 
$\mathcal{AX}_{\mathtt{diff}}$ also contains non-standard models. This is because the axioms of 
both $\mathcal{AX}_{\mathrm{ext}}$ and $\mathcal{AX}_{\mathtt{diff}}$, being first-order 
formulae, do not constrain the interpretation of the sort ARRAY to contain all mappings from 
indexes to elements. (This is similar to the interpretation of function variables according 
to the Henkin semantics of second-order logic; see, e.g., [27].) Fortunately, because of 
the extensionality axiom, it is easy to show (see below) that every model of such theories 
embeds into a standard one (recall the definition of embedding in Section 2.2). This means 
that any model is isomorphic to a sub-structure of a standard model in which arrays are 
interpreted as functions, although it might happen that not all functions are part of the 
interpretation of ARRAY in the model. As a consequence, whenever we want to test the 
validity of universal formulae or the satisfiability of constraints, we can, w.l.o.g., consider 
only standard models. (This fact will be used in the proofs of some results in later sections, 
such as the proof of Lemma 4.3 where a standard model is built to show the satisfiability 
of a certain class of constraints of $\mathcal{AX}_{\mathtt{diff}}$.) 

We show that the universal theory $\mathcal{AX}_{\mathtt{diff}}$ has the amalgamation property 
so that, by Theorem 2.3, we are entitled to conclude that it admits quantifier-free 
interpolation. Recall from Section 2.2 that a universal theory has the amalgamation property if 
two of its models can be glued as substructures of a third model. Thus, we need to consider 
arbitrary models of $\mathcal{AX}_{\mathtt{diff}}$, not only the standard ones. This is why we 
need more insight into arbitrary models of our theories and their relationship to standard ones. 

Let us choose an arbitrary model $\mathcal{M}$ of $\mathcal{AX}_{\mathrm{ext}}$. We can build the 
standard model $std(\mathcal{M})$ such that 
$\mathtt{INDEX}^{std(\mathcal{M})} = \mathtt{INDEX}^{\mathcal{M}}$ and 
$\mathtt{ELEM}^{std(\mathcal{M})} = \mathtt{ELEM}^{\mathcal{M}}$. To embed $\mathcal{M}$ into 
$std(\mathcal{M})$, it is sufficient to associate with every $a \in \mathtt{ARRAY}^{\mathcal{M}}$ 
the function mapping $i$ to $rd^{\mathcal{M}}(a, i)$ (this is an embedding because of the 
extensionality axiom). In this way, we can identify $\mathtt{ARRAY}^{\mathcal{M}}$ with a subset 
of the set of all functions $\mathtt{ARRAY}^{std(\mathcal{M})}$. If we call functional a model 
$\mathcal{M}$ in which $\mathtt{ARRAY}^{\mathcal{M}}$ is a subset of the set of functions from 
$\mathtt{INDEX}^{\mathcal{M}}$ to $\mathtt{ELEM}^{\mathcal{M}}$ (and in which 
$rd^{\mathcal{M}}, wr^{\mathcal{M}}$ have the standard meaning), we have just shown that every 
model is isomorphic to a functional one. (The argument extends to models of 
$\mathcal{AX}_{\mathtt{diff}}$ although, in a standard model, the interpretation of diff is not 
fixed as the interpretations of rd and wr are.) In this respect, the crucial question is the 
following: which subsets of the set $\mathtt{ARRAY}^{\mathcal{N}}$ in a standard model 
$\mathcal{N}$ can be the support $\mathtt{ARRAY}^{\mathcal{M}}$ of a functional model 
$\mathcal{M}$ (with $\mathtt{INDEX}^{\mathcal{M}} = \mathtt{INDEX}^{\mathcal{N}}$, 
$\mathtt{ELEM}^{\mathcal{M}} = \mathtt{ELEM}^{\mathcal{N}}$) that is a substructure of 
$\mathcal{N}$? We shall answer the question by using the notion of "closure under cardinality 
dependence," that we formally define next. 

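Let $a, b$ be elements of $\mathtt{ARRAY}^{\mathcal{M}}$ in a model $\mathcal{M}$ of 
$\mathcal{AX}_{\mathtt{diff}}$. We say that $a$ and $b$ are cardinality dependent (in symbols, 
$\mathcal{M} \models |a - b| < \omega$) iff 
$\{i \in \mathtt{INDEX}^{\mathcal{M}} \mid \mathcal{M} \models rd(a, i) \neq rd(b, i)\}$ is 
finite. Cardinality dependency is obviously an equivalence relation. 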
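
Cardinality dependence can be illustrated on a restricted family of arrays over an infinite index set. The sketch below assumes arrays over the natural numbers that are eventually constant, encoded as a default value plus finitely many exceptions; this encoding and all function names are illustrative assumptions. For such arrays, two arrays differ at finitely many indexes exactly when their defaults coincide, and then one can be obtained from the other by finitely many writes.

```python
def make_array(default, exceptions=None):
    """Array over the naturals: `default` everywhere except at finitely many indexes."""
    # Drop exceptions equal to the default so that equal arrays have equal encodings.
    exc = {i: v for i, v in (exceptions or {}).items() if v != default}
    return (default, frozenset(exc.items()))


def rd(a, i):
    default, exc = a
    return dict(exc).get(i, default)


def wr(a, i, e):
    default, exc = a
    updated = dict(exc)
    updated[i] = e
    return make_array(default, updated)


def differing_indexes(a, b):
    """Finite set of indexes where a and b differ, or None if that set is infinite."""
    if a[0] != b[0]:
        return None  # different defaults: the arrays differ at all but finitely many indexes
    candidates = {i for i, _ in a[1]} | {i for i, _ in b[1]}
    return {i for i in candidates if rd(a, i) != rd(b, i)}


def cardinality_dependent(a, b):
    return differing_indexes(a, b) is not None


def rewrite_to(a, b):
    """Turn b into a by finitely many writes; requires cardinality dependence."""
    for i in sorted(differing_indexes(a, b)):
        b = wr(b, i, rd(a, i))
    return b
```

`rewrite_to` mirrors the observation, used in the proofs below, that cardinality dependent arrays satisfy an equation of the form a = wr(b, I, E).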

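Lemma 3.1. Let $\mathcal{N}, \mathcal{M}$ be models of $\mathcal{AX}_{\mathtt{diff}}$ such that 
$\mathcal{M}$ is a substructure of $\mathcal{N}$. For every 
$a, b \in \mathtt{ARRAY}^{\mathcal{M}}$, we have that 

$\mathcal{M} \models |a - b| < \omega$ iff $\mathcal{N} \models |a - b| < \omega$. 

Proof. The left-to-right side is trivial because if $\mathcal{M} \models |a - b| < \omega$ then 
$\mathcal{M} \models a = wr(b, I, E)$, where $I \equiv i_1, \dots, i_n$ is a list of terms of 
sort INDEX, $E \equiv e_1, \dots, e_n$ is a list of terms of sort ELEM, and $wr(b, I, E)$ 
abbreviates the term $wr(wr(\cdots wr(b, i_1, e_1) \cdots), i_n, e_n)$ (this and similar 
notations will be discussed in more detail in Section [3]). Thus, also 
$\mathcal{N} \models a = wr(b, I, E)$ because $\mathcal{M}$ is a substructure of $\mathcal{N}$. 
Vice versa, suppose that $\mathcal{M} \not\models |a - b| < \omega$. This means that there are 
infinitely many $i \in \mathtt{INDEX}^{\mathcal{M}}$ such that 
$rd^{\mathcal{M}}(a, i) \neq rd^{\mathcal{M}}(b, i)$. Since $\mathcal{M}$ is a substructure of 
$\mathcal{N}$, there are also infinitely many $i \in \mathtt{INDEX}^{\mathcal{N}}$ such that 
$rd^{\mathcal{N}}(a, i) \neq rd^{\mathcal{N}}(b, i)$, i.e. 
$\mathcal{N} \not\models |a - b| < \omega$. □ 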

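We are now in the position to show how any functional model $\mathcal{M}$ of 
$\mathcal{AX}_{\mathtt{diff}}$ (i.e. up to isomorphism, any model whatsoever) can be obtained 
from a standard one. In order to produce any such $\mathcal{M}$, it is sufficient to take a 
standard model $\mathcal{N}$, to let $\mathtt{INDEX}^{\mathcal{M}} = \mathtt{INDEX}^{\mathcal{N}}$, 
$\mathtt{ELEM}^{\mathcal{M}} = \mathtt{ELEM}^{\mathcal{N}}$, and to let 
$\mathtt{ARRAY}^{\mathcal{M}}$ be equal to any subset of $\mathtt{ARRAY}^{\mathcal{N}}$ that is 
closed under cardinality dependence, i.e. such that if $a \in \mathtt{ARRAY}^{\mathcal{M}}$ and 
$\mathcal{N} \models |a - b| < \omega$, then $b$ is also in $\mathtt{ARRAY}^{\mathcal{M}}$. In 
other words, functional substructures $\mathcal{M}$ of $\mathcal{N}$ with 
$\mathtt{INDEX}^{\mathcal{M}} = \mathtt{INDEX}^{\mathcal{N}}$ and 
$\mathtt{ELEM}^{\mathcal{M}} = \mathtt{ELEM}^{\mathcal{N}}$ are in bijective correspondence with 
subsets of $\mathtt{ARRAY}^{\mathcal{N}}$ closed under cardinality dependence. 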

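A similar remark holds for embeddings. Suppose that $\mu : \mathcal{N} \to \mathcal{M}$ is an 
embedding that restricts to an inclusion 
$\mathtt{INDEX}^{\mathcal{N}} \subseteq \mathtt{INDEX}^{\mathcal{M}}$, 
$\mathtt{ELEM}^{\mathcal{N}} \subseteq \mathtt{ELEM}^{\mathcal{M}}$ for $\mathcal{M}$ and 
$\mathcal{N}$ functional models of $\mathcal{AX}_{\mathtt{diff}}$. The action of the embedding 
$\mu$ on $\mathtt{ARRAY}^{\mathcal{N}}$ can be characterized as follows: take an element $a$ for 
each cardinality dependence equivalence class, extend $a$ arbitrarily to the set 
$\mathtt{INDEX}^{\mathcal{M}} \setminus \mathtt{INDEX}^{\mathcal{N}}$ to produce $\mu(a)$ and 
then define $\mu(b)$ for non-representative $b$ in the only possible way for wr to be preserved; 
i.e. if $\mathcal{N} \models b = wr(a, I, E)$ for a representative $a$, let $\mu(b)$ be 
$wr^{\mathcal{M}}(\mu(a), I, E)$. 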

By using the observation above, we are ready to show that $\mathcal{AX}_{\mathtt{diff}}$ has the 
amalgamation property. 

Theorem 3.2. The theory $\mathcal{AX}_{\mathtt{diff}}$ has the amalgamation property. 

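Proof. Take two embeddings $\mu_0 : \mathcal{N} \to \mathcal{M}_0$ and 
$\mu_1 : \mathcal{N} \to \mathcal{M}_1$. As observed above, we can suppose, w.l.o.g., that 
$\mathcal{N}, \mathcal{M}_0, \mathcal{M}_1$ are functional models, that $\mu_0, \mu_1$ restrict 
to inclusions for the sorts INDEX and ELEM, and that 
$(\mathtt{ELEM}^{\mathcal{M}_0} \setminus \mathtt{ELEM}^{\mathcal{N}}) \cap (\mathtt{ELEM}^{\mathcal{M}_1} \setminus \mathtt{ELEM}^{\mathcal{N}}) = \emptyset$, 
$(\mathtt{INDEX}^{\mathcal{M}_0} \setminus \mathtt{INDEX}^{\mathcal{N}}) \cap (\mathtt{INDEX}^{\mathcal{M}_1} \setminus \mathtt{INDEX}^{\mathcal{N}}) = \emptyset$. 
To simplify our task, we can also suppose, again w.l.o.g., that there exist some 
$e_i \in (\mathtt{ELEM}^{\mathcal{M}_i} \setminus \mathtt{ELEM}^{\mathcal{N}})$ and some 
$j_i \in (\mathtt{INDEX}^{\mathcal{M}_i} \setminus \mathtt{INDEX}^{\mathcal{N}})$ (i.e. that these 
sets are not empty), for $i = 0, 1$. (If this additional condition is not satisfied, it is 
sufficient to enlarge $\mathcal{M}_0, \mathcal{M}_1$ so that they satisfy it.) The amalgamated 
model $\mathcal{M}$ will be the standard model over 
$\mathtt{INDEX}^{\mathcal{M}_0} \cup \mathtt{INDEX}^{\mathcal{M}_1}$ and 
$\mathtt{ELEM}^{\mathcal{M}_0} \cup \mathtt{ELEM}^{\mathcal{M}_1}$. We need to define 
$\nu_i : \mathcal{M}_i \to \mathcal{M}$ ($i = 0, 1$) in such a way that 
$\nu_0 \circ \mu_0 = \nu_1 \circ \mu_1$. The only relevant point is the action of $\nu_i$ on 
$\mathtt{ARRAY}^{\mathcal{M}_i}$: as observed above, in order to define it, it is sufficient to 
extend any $a \in \mathtt{ARRAY}^{\mathcal{M}_i}$ to the indexes 
$k \in (\mathtt{INDEX}^{\mathcal{M}_{1-i}} \setminus \mathtt{INDEX}^{\mathcal{N}})$: 

(I) we let the value $\nu_i(a)(k)$ be $e_i$ in case there is no $c$ such that $\mathcal{M}_i \models |a - \mu_i(c)| < \omega$; 
(II) otherwise, we can do the following: take any $c$ such that $\mathcal{M}_i \models |a - \mu_i(c)| < \omega$ 
and put $\nu_i(a)(k) = \mu_{1-i}(c)(k)$. 
Because of Lemma 3.1 the choice of $c$ in (II) above is immaterial. In fact, any other $c'$ 
differs from $c$ only w.r.t. a finite set of indices in $\mathcal{M}_i$. This also holds in 
$\mathcal{N}$ (by Lemma 3.1) and thus we have $\mathcal{N} \models c' = wr(c, I, E)$ for some 
$I \subseteq \mathtt{INDEX}^{\mathcal{N}}$. The latter implies that $\mu_{1-i}(c)$ and 
$\mu_{1-i}(c')$ cannot differ at any 
$k \in (\mathtt{INDEX}^{\mathcal{M}_{1-i}} \setminus \mathtt{INDEX}^{\mathcal{N}})$. This 
guarantees that $\nu_0 \circ \mu_0 = \nu_1 \circ \mu_1$. 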

In order to define $\mathtt{diff}^{\mathcal{M}}$ we can simply extend 
$\mathtt{diff}^{\mathcal{M}_0} \cup \mathtt{diff}^{\mathcal{M}_1}$ in such a way that 
axiom (3.3) holds. More precisely, we define $\mathtt{diff}^{\mathcal{M}}(a, b)$ as follows: 
(i) if for some $i = 0, 1$, we have that $a = \nu_i(a')$ and $b = \nu_i(b')$, then 
$\mathtt{diff}^{\mathcal{M}}(a, b)$ is taken to be $\mathtt{diff}^{\mathcal{M}_i}(a', b')$; (ii) 
otherwise it is defined to be any $i$ such that $a(i) \neq b(i)$ (it is arbitrary whenever 
$a = b$). For this definition of $\mathtt{diff}^{\mathcal{M}}$ to be correct, it is sufficient 
to show that 
Claim: if $a = \nu_0(a_0) = \nu_1(a_1)$, then there exists $c$ such that $a_0 = \mu_0(c)$ and $a_1 = \mu_1(c)$. 
To prove the claim, suppose that $a = \nu_0(a_0) = \nu_1(a_1)$. Then $\nu_0(a_0)$ and 
$\nu_1(a_1)$ must have been defined as in (II) above (otherwise they cannot coincide with each 
other at indexes $j_0, j_1$),^3 which means that there exist $c_i$ such that for $i = 0, 1$ we 
have $\mathcal{M}_i \models |a_i - \mu_i(c_i)| < \omega$. Since $\nu_0(a_0) = a = \nu_1(a_1)$, 
this means that $\nu_0(\mu_0(c_0)) = \nu_1(\mu_1(c_0))$ and $a$ differ only at finitely many 
indexes; the same is true for $\nu_1(\mu_1(c_1))$ and $a$, which in turn implies that 
$\nu_1(\mu_1(c_0))$ and $\nu_1(\mu_1(c_1))$ differ only at finitely many indexes too. The same 
consequently holds for $c_0, c_1$ in $\mathcal{N}$ too, for $\mu_0(c_0)$ and $\mu_0(c_1)$ in 
$\mathcal{M}_0$ and for $\mu_1(c_0)$ and $\mu_1(c_1)$ in $\mathcal{M}_1$. Thus, since the choice 
of $c$ in (II) is immaterial, we can suppose, w.l.o.g., that $c_0 = c_1$ (let us use just $c$ to 
name it). Then, by (II) applied to the definition of $\nu_1(a_1)$, we have that 
$\nu_0(\mu_0(c)) = \nu_1(\mu_1(c))$ and $a = \nu_1(a_1)$ cannot differ at any 
$k \in (\mathtt{INDEX}^{\mathcal{M}_0} \setminus \mathtt{INDEX}^{\mathcal{N}})$. Similarly, 
$\nu_0(\mu_0(c)) = \nu_1(\mu_1(c))$ and $a$ cannot differ at any 
$k \in (\mathtt{INDEX}^{\mathcal{M}_1} \setminus \mathtt{INDEX}^{\mathcal{N}})$. Thus $a$ and 
$\nu_0(\mu_0(c)) = \nu_1(\mu_1(c))$ possibly differ only for 
$k \in \mathtt{INDEX}^{\mathcal{N}}$ and actually only for finitely many such $k$. But 
$a = \nu_0(a_0) = \nu_1(a_1)$, so the values of $a$ at any $k \in \mathtt{INDEX}^{\mathcal{N}}$ 
belong to $\mathtt{ELEM}^{\mathcal{M}_0} \cap \mathtt{ELEM}^{\mathcal{M}_1} = \mathtt{ELEM}^{\mathcal{N}}$, 
which means that $a$ is equal to 
$wr^{\mathcal{M}}(\nu_0(\mu_0(c)), I, E) = \nu_0(\mu_0(wr^{\mathcal{N}}(c, I, E)))$ for 
$I \subseteq \mathtt{INDEX}^{\mathcal{N}}$ and $E \subseteq \mathtt{ELEM}^{\mathcal{N}}$. In 
conclusion, we have that $a$ is of the kind $\nu_0(\mu_0(c)) = \nu_1(\mu_1(c))$ and from 
$a = \nu_0(a_0) = \nu_1(a_1)$ we get $a_0 = \mu_0(c)$ and $a_1 = \mu_1(c)$ because 
$\nu_0, \nu_1$ are injective. □ 

Before stating the main result of the paper, which immediately follows from Theorems 2.3 and 3.2, it is interesting to observe the following about the Claim used in the proof of Theorem 3.2. The property mentioned in the Claim is known as the strong amalgamability property in Universal Algebra and is key to deriving quantifier-free interpolation in combinations of theories [19]. The fact that $\mathcal{AX}_{\mathrm{diff}}$ enjoys strong amalgamability is crucial to transfer quantifier-free interpolation to combinations of $\mathcal{AX}_{\mathrm{diff}}$ with other important theories, like equality with uninterpreted symbols, difference logic, real arithmetic, appropriate variants of integer linear arithmetic, etc. We refer the reader to [19] for details.

Theorem 3.3. The theory $\mathcal{AX}_{\mathrm{diff}}$ admits quantifier-free interpolation.

Footnote (to the proof of the Claim): The Claim might be false in case $\mathtt{INDEX}^{\mathcal{M}_1} = \mathtt{INDEX}^{\mathcal{N}} = \mathtt{INDEX}^{\mathcal{M}_2}$: this is the reason why we enlarged $\mathtt{INDEX}^{\mathcal{M}_i}$ by adding the extra indexes $j_1, j_2$.

We conclude this section with some observations concerning the theories $\mathcal{AX}_{\mathrm{ext}}$ and $\mathcal{AX}$. Lemma 3.1 holds also for the theory $\mathcal{AX}_{\mathrm{ext}}$ and the proof of Theorem 3.2 goes through also for $\mathcal{AX}_{\mathrm{ext}}$. However, according to Theorem 2.3 in Section 2.2, amalgamation alone is not sufficient for establishing quantifier-free interpolation for theories like $\mathcal{AX}_{\mathrm{ext}}$ which are not universal (for non-universal theories one needs sub-amalgamability, not just amalgamability, see [19]). Indeed, $\mathcal{AX}_{\mathrm{ext}}$ is amalgamable but does not admit quantifier-free interpolation.

Despite being universal, $\mathcal{AX}$ is not amalgamable and thus it does not admit quantifier-free interpolation. Indeed, the left-to-right implication of Lemma 3.1 does not hold for $\mathcal{AX}$, and neither do the arguments in the proof of Theorem 3.2. To get a formal counterexample to the amalgamability of $\mathcal{AX}$, consider the following situation. Let $\mathcal{N}$ be the $\mathcal{AX}$-model in which $\mathtt{ELEM}^{\mathcal{N}}$ and $\mathtt{INDEX}^{\mathcal{N}}$ are empty and $\mathtt{ARRAY}^{\mathcal{N}}$ contains two distinct elements, say $a$ and $b$. As already observed, empty supports must be taken into account when showing the amalgamation property and, for $\mathcal{AX}$, the axiom of extensionality need not be satisfied. Extend $\mathcal{N}$ to two standard models $\mathcal{M}_1$ and $\mathcal{M}_2$, where $\mathtt{ELEM}^{\mathcal{M}_1} = \{e, e'\}$, $\mathtt{INDEX}^{\mathcal{M}_1} = \{i\}$ and $\mathtt{ELEM}^{\mathcal{M}_2} = \{e_1, e_2\}$, $\mathtt{INDEX}^{\mathcal{M}_2} = \{j_1, j_2\}$. Then, embed $\mathcal{N}$ into $\mathcal{M}_1$ by letting $a, b$ differ at $i$ (thus, e.g., $\mathcal{M}_1 \models a = wr(b, i, e) \wedge rd(b, i) = e'$) and embed $\mathcal{N}$ into $\mathcal{M}_2$ by letting $a, b$ differ at both $j_1$ and $j_2$. Now, observe that amalgamation fails because we should have

$\mathcal{M} \models a = wr(b, i, e) \wedge rd(a, j_1) \neq rd(b, j_1) \wedge rd(a, j_2) \neq rd(b, j_2) \wedge j_1 \neq j_2$

in any amalgamated model $\mathcal{M}$, and this is in contradiction with the two axioms of $\mathcal{AX}$: after a single write at $i$, the arrays $a$ and $b$ can differ at no more than one index.

4. Modular constraints for Arrays with diff and their combinations 

Theorem 3.3 is proved by semantic arguments, hence it does not give an interpolation algorithm; it only guarantees that, by enumerating quantifier-free formulae, one can find sooner or later the desired interpolant. In the rest of the paper, we develop (independently of the results of Section 3) techniques based on rewriting and constraint solving to construct an algorithm computing quantifier-free interpolants for conjunctions of ground literals in $\mathcal{AX}_{\mathrm{diff}}$. Here, we introduce the notion of "modular constraint," which is the main data structure manipulated by the quantifier-free interpolation procedure, and we prove two key properties. First, we show that the satisfiability of modular constraints can be easily detected (Lemma 4.3). Second, we prove that they can be combined in a modular way (Proposition 4.5).

Preliminarily, we introduce some notational conventions which are specific for constraints in the theory $\mathcal{AX}_{\mathrm{diff}}$. We use $a, b, \ldots$ to denote free constants of sort ARRAY, $i, j, \ldots$ for free constants of sort INDEX, and $d, e, \ldots$ for free constants of sort ELEM; $\alpha, \beta, \ldots$ stand for free constants of any sort. Below, we shall introduce non-ground rewriting rules involving (universally quantified) variables of sort ARRAY: for these variables, we shall use the symbols $x, y, z, \ldots$. We make use of the following abbreviations.

- [Nested write terms] By $wr(a, I, E)$ we indicate a nested write on the array variable $a$, where indexes are represented by the free constants list $I \equiv i_1, \ldots, i_n$ and elements by the free constants list $E \equiv e_1, \ldots, e_n$; more precisely, $wr(a, I, E)$ abbreviates the term $wr(wr(\cdots wr(a, i_1, e_1) \cdots), i_n, e_n)$. Notice that, whenever the notation $wr(a, I, E)$ is used, the lists $I$ and $E$ must have the same length; for empty $I, E$, the term $wr(a, I, E)$ conventionally stands for $a$.

Refl
    $wr(a, I, E) = a \leftrightarrow rd(a, I) = E$
    Proviso: Distinct($I$)

Symm
    $(wr(a, I, E) = b \wedge rd(a, I) = D) \leftrightarrow (wr(b, I, D) = a \wedge rd(b, I) = E)$
    Proviso: Distinct($I$)

Trans
    $(a = wr(b, I, E) \wedge b = wr(c, J, D)) \leftrightarrow (a = wr(c, J \cdot I, D \cdot E) \wedge b = wr(c, J, D))$

Confl
    $b = wr(a, I \cdot J, E \cdot D) \wedge b = wr(a, I \cdot H, E' \cdot F) \leftrightarrow (b = wr(a, I, E) \wedge E = E' \wedge rd(a, J) = D \wedge rd(a, H) = F)$
    Proviso: Distinct($I \cdot J \cdot H$)

Red
    $(a = wr(b, I, E) \wedge rd(b, i_k) = e_k) \leftrightarrow (a = wr(b, I - k, E - k) \wedge rd(b, i_k) = e_k)$
    Proviso: Distinct($I$)

Legenda: $a$ and $b$ are constants of sort ARRAY; $I \equiv i_1, \ldots, i_n$, $J \equiv j_1, \ldots, j_m$ and $H \equiv h_1, \ldots, h_l$ are lists of constants of sort INDEX; $E \equiv e_1, \ldots, e_n$, $E' \equiv e'_1, \ldots, e'_n$, $D \equiv d_1, \ldots, d_m$, and $F \equiv f_1, \ldots, f_l$ are lists of constants of sort ELEM.

Figure 1: Key properties of write terms

- [Multiple read literals] Let $a$ be a constant of sort ARRAY, and let $I \equiv i_1, \ldots, i_n$ and $E \equiv e_1, \ldots, e_n$ be lists of free constants of sort INDEX and ELEM, respectively; $rd(a, I) = E$ abbreviates the formula $rd(a, i_1) = e_1 \wedge \cdots \wedge rd(a, i_n) = e_n$.

- [Multiple equalities] If $L \equiv \alpha_1, \ldots, \alpha_n$ and $L' \equiv \alpha'_1, \ldots, \alpha'_n$ are lists of constants of the same sort, by $L = L'$ we indicate the formula $\bigwedge_{i=1}^{n} \alpha_i = \alpha'_i$.

- [Multiple distinctions] If $L \equiv \alpha_1, \ldots, \alpha_n$ is a list of constants of the same sort, by Distinct($L$) we abbreviate the formula $\bigwedge_{i \neq j} \alpha_i \neq \alpha_j$.

- [Juxtaposition and subtraction] If $L \equiv \alpha_1, \ldots, \alpha_n$ and $L' \equiv \alpha'_1, \ldots, \alpha'_m$ are lists of constants, by $L \cdot L'$ we indicate the list $\alpha_1, \ldots, \alpha_n, \alpha'_1, \ldots, \alpha'_m$; for $1 \leq k \leq n$, the list $L - k$ is the list $\alpha_1, \ldots, \alpha_{k-1}, \alpha_{k+1}, \ldots, \alpha_n$.
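The abbreviations above can be mirrored executably. The following Python sketch is only an illustration and not part of the formal development: it assumes that arrays are modelled as finite dictionaries (an index missing from the dictionary reads as `None`), and the helper names `wr_list`, `rd_list_eq`, `distinct`, `juxtapose` and `minus` are ours, not the paper's.

```python
def rd(a, i):
    """Read: function application on a dictionary-backed array."""
    return a.get(i)

def wr(a, i, e):
    """Write: a copy of `a` that maps index i to element e."""
    b = dict(a)
    b[i] = e
    return b

def wr_list(a, I, E):
    """Nested write wr(a, I, E): i_1 is written first, i_n last."""
    if len(I) != len(E):
        raise ValueError("the lists I and E must have the same length")
    for i, e in zip(I, E):
        a = wr(a, i, e)
    return a  # for empty I, E this is just a

def rd_list_eq(a, I, E):
    """Multiple read literal rd(a, I) = E."""
    return len(I) == len(E) and all(rd(a, i) == e for i, e in zip(I, E))

def distinct(L):
    """Distinct(L): the members of L are pairwise different."""
    return len(set(L)) == len(L)

def juxtapose(L1, L2):
    """Juxtaposition L . L'."""
    return list(L1) + list(L2)

def minus(L, k):
    """Subtraction L - k: drop the k-th member (k is 1-based)."""
    return list(L[:k - 1]) + list(L[k:])
```

For instance, `wr_list({"i1": "x"}, ["i1", "i2"], ["p", "q"])` builds the array that holds `p` at `i1` and `q` at `i2`, and leaves the original dictionary untouched.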

Some key properties of equalities involving write terms are stated in the following lemma (see also Figure 1).

Lemma 4.1 (Key properties of write terms). The formulae in Figure 1 are all $\mathcal{AX}_{\mathrm{diff}}$-valid under the assumption that their provisoes, if any, hold (when we say that a formula $\varphi$ is $\mathcal{AX}_{\mathrm{diff}}$-valid under the proviso $\pi$, we just mean that $\pi \vdash_{\mathcal{AX}_{\mathrm{diff}}} \varphi$).

Proof. The properties in Figure 1 are all straightforward to derive. Here, we just sketch the proof of Transitivity, as an example: one side is by replacement of equals; for the right-to-left side, notice that the equalities $a = wr(c, J \cdot I, D \cdot E)$ and $b = wr(c, J, D)$ can be used as rewrite rules to rewrite both members of $a = wr(b, I, E)$ to the same term. □
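The properties of Figure 1 can also be sanity-checked by brute force on a tiny standard model. The sketch below is an illustration under our own assumptions (two indexes, two elements, arrays as tuples; all function names are hypothetical): it confirms Refl under its proviso, shows that Refl can fail when the proviso Distinct($I$) is dropped, and confirms Trans, which has no proviso. Such a finite check is no substitute for the proof.

```python
from itertools import product

INDEXES = (0, 1)
ELEMS = (0, 1)
# An array of the standard model is a total function INDEXES -> ELEMS,
# stored here as a tuple whose position is the index.
ARRAYS = list(product(ELEMS, repeat=len(INDEXES)))

def wr(a, I, E):
    """Nested write wr(a, I, E), first pair written first."""
    a = list(a)
    for i, e in zip(I, E):
        a[i] = e
    return tuple(a)

def rd_eq(a, I, E):
    """Multiple read literal rd(a, I) = E."""
    return all(a[i] == e for i, e in zip(I, E))

def refl_holds(a, I, E):
    return (wr(a, I, E) == a) == rd_eq(a, I, E)

def trans_holds(a, b, c, I, E, J, D):
    lhs = a == wr(b, I, E) and b == wr(c, J, D)
    rhs = a == wr(c, J + I, D + E) and b == wr(c, J, D)
    return lhs == rhs

# All pairs of equally long index/element lists of length 0, 1 or 2.
PAIRS = [(I, E)
         for n in (0, 1, 2)
         for I in product(INDEXES, repeat=n)
         for E in product(ELEMS, repeat=n)]

refl_ok = all(refl_holds(a, I, E)
              for a in ARRAYS for I, E in PAIRS if len(set(I)) == len(I))
refl_without_proviso = all(refl_holds(a, I, E)
                           for a in ARRAYS for I, E in PAIRS)
trans_ok = all(trans_holds(a, b, c, I, E, J, D)
               for a in ARRAYS for b in ARRAYS for c in ARRAYS
               for I, E in PAIRS for J, D in PAIRS)
```

The failing instance of Refl without the proviso is a repeated index: writing `0` and then `1` at the same position leaves an array unchanged exactly when it already holds `1` there, although the two read literals contradict each other.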

4.1. Modular constraints in $\mathcal{AX}_{\mathrm{diff}}$. A (ground) flat literal is a literal of the form $a = wr(b, I, E)$, $rd(a, i) = e$, $\mathtt{diff}(a, b) = i$, $\alpha = \beta$, $\alpha \neq \beta$. Notice that replacing a sub-term $t$ with a fresh constant $\alpha$ in a constraint $A$ and adding the corresponding defining equation $\alpha = t$ to $A$ always produces an $\exists$-equivalent constraint; by repeatedly applying this method, one can show that every constraint is $\exists$-equivalent to a flat constraint, i.e., to one containing only flat literals. We split a flat constraint $A$ into two parts, the index part $A_I$ and the main part $A_M$: $A_I$ contains the literals of the form $i = j$, $i \neq j$, $\mathtt{diff}(a, b) = i$, whereas $A_M$ contains the remaining literals, i.e., those of the form $a = wr(b, I, E)$, $a \neq b$, $rd(a, i) = e$, $e = d$, $e \neq d$ (atoms $a = b$ are identified with literals $a = wr(b, \emptyset, \emptyset)$). We write $A = \langle A_I, A_M \rangle$ to indicate the two parts of the constraint $A$. In the main part of a constraint, positive literals will be treated as rewrite rules; to get a suitable orientation, we use a lexicographic path ordering with a total precedence $>$ such that $a > wr > rd > \mathtt{diff} > i > e$, for all $a, i, e$ of the corresponding sorts. This choice orients equalities $a = wr(b, I, E)$ from left to right when $a > b$; equalities like $a = wr(b, I, E)$ for $a < b$ or $a \equiv b$ will be called badly orientable equalities.

Definition 4.2. A constraint $A = \langle A_I, A_M \rangle$ is said to be modular iff it is flat and the following conditions are satisfied (we let $I, E$ be the sets of free constants of sort INDEX and ELEM occurring in $A$):

(o) no positive index literal $i = j$ occurs in $A_I$;

(i) no negative array literal $a \neq b$ occurs in $A_M$;

(ii) $A_M$ does not contain badly orientable equalities;

(iii) the rewriting system $A_R$ given by the oriented positive literals of $A_M$ joined with the rewriting rules

$rd(wr(x, i, e), j) \to rd(x, j)$ for $i, j \in I$, $e \in E$, $i \not\equiv j$ (4.1)

$rd(wr(x, i, e), i) \to e$ for $i \in I$, $e \in E$ (4.2)

$wr(wr(x, i, e), j, d) \to wr(wr(x, j, d), i, e)$ for $i, j \in I$, $e, d \in E$, $i > j$ (4.3)

$wr(wr(x, i, e), i, d) \to wr(x, i, d)$ for $i \in I$, $e, d \in E$ (4.4)

is confluent and ground irreducible;

(iv) if $a = wr(b, I, E) \in A_M$ and $i, e$ are in the same position in the lists $I, E$, respectively, then $rd(b, i) \not\downarrow_{A_R} e$;

(v) $\{\mathtt{diff}(a, b) = i, \mathtt{diff}(a', b') = i'\} \subseteq A_I$ and $a \downarrow_{A_R} a'$ and $b \downarrow_{A_R} b'$ imply $i \equiv i'$;

(vi) $\mathtt{diff}(a, b) = i \in A_I$ and $rd(a, i) \downarrow_{A_R} rd(b, i)$ imply $a \downarrow_{A_R} b$.

Condition (o) means that the index constants occurring in a modular constraint are implicitly assumed to denote distinct objects. This is supported also by the statement of Lemma 4.3 below, from which it is evident that the addition of all the negative literals $i \neq j$ (for $i, j \in I$, with $i \not\equiv j$) does not compromise the satisfiability of a modular constraint, precisely because such negative literals are implicitly (already) part of the constraint. In Condition (i), negative array literals $a \neq b$ are not allowed because they can be replaced by suitable literals involving fresh constants and the $\mathtt{diff}$ operation (see axiom (3.3)). Rules (4.1) and (4.2) mentioned in condition (iii) reduce read-over-writes, and rules (4.3) and (4.4) sort indexes in flat terms $wr(a, I, E)$ in ascending order. In addition, condition (iv) prevents further redundancies in our rules. Finally, conditions (v) and (vi) deal with $\mathtt{diff}$. In particular, (v) says that $\mathtt{diff}$ is "well defined" and (vi) is a "conditional" translation of the contraposition of axiom (3.3).
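To make the effect of rules (4.1)-(4.4) concrete, the following Python sketch computes the normal form they induce on ground write lists and on read-over-write terms. It is a shortcut, not a rewriting engine: it assumes that distinct index constants denote distinct objects (as Condition (o) allows), that the precedence on index constants is the ordinary string ordering, and it jumps directly to the normal form instead of applying the rules one step at a time. The names `normalize_writes` and `reduce_read` are ours.

```python
def normalize_writes(I, E):
    """Normal form of the lists of wr(a, I, E) w.r.t. rules (4.3)-(4.4)."""
    last = {}
    for i, e in zip(I, E):
        last[i] = e              # rule (4.4): the outermost write to an index wins
    J = sorted(last)             # rule (4.3): indexes end up in ascending order
    return J, [last[j] for j in J]

def reduce_read(base, I, E, j):
    """Normal form of rd(wr(base, I, E), j) w.r.t. rules (4.1)-(4.2)."""
    J, D = normalize_writes(I, E)
    if j in J:
        return D[J.index(j)]     # rule (4.2): the read hits a written index
    return ("rd", base, j)       # rule (4.1): the read passes through every write
```

For example, `normalize_writes(["i3", "i1", "i3"], ["e", "d", "f"])` yields the lists `["i1", "i3"]` and `["d", "f"]`, i.e. the term $wr(wr(a, i_1, d), i_3, f)$.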

The non-ground rules from Definition 4.2(iii) form a convergent rewrite system (critical pairs are confluent): this can be checked manually (and can be confirmed also by tools like SPASS or MAUDE). Ground rules from $A_R$ are of the form

$a \to wr(b, I, E)$, (4.5)
$rd(a, i) \to e$, (4.6)
$e \to d$. (4.7)

Only rules of the form (4.7) can overlap with the non-ground rules (4.1)-(4.4), but the resulting critical pairs are trivially confluent. Thus, in order to check confluence of $A_R$, only overlaps between ground rules (4.5)-(4.7) need to be considered (this is the main advantage of our choice to orient equalities $a = wr(b, I, E)$ from left to right instead of right to left).

Footnote (on "ground irreducible" in Definition 4.2(iii)): Ground irreducibility means that no rule can be used to reduce the left-hand or the right-hand side of another ground rule. Notice that ground rules from $A_R$ are precisely the rules obtained by orienting an equality from $A_M$ (rules (4.1)-(4.4) are not ground as they contain one variable, namely the array variable $x$).

Lemma 4.3. Suppose that $A$ is modular. Then $A$ is $\mathcal{AX}_{\mathrm{diff}}$-satisfiable iff there is no element inequality $e \neq d$ in $A_M$ such that $e \downarrow_{A_R} d$. Moreover, $A$ is $\mathcal{AX}_{\mathrm{diff}}$-satisfiable iff

$A \cup \{ i \neq j \mid i, j \in I,\ i \not\equiv j \} \cup \{ \alpha \neq \beta \}_{\alpha, \beta}$

(varying $\alpha, \beta$ among the different pairs of element and array constants in normal form occurring in $A$) is $\mathcal{AX}_{\mathrm{diff}}$-satisfiable.

Proof. Clearly, the satisfiability of $A$ implies that for no negative element literal $e \neq d$ from $A_M$ we have that $e \downarrow_{A_R} d$. Assume conversely that this is the case: our aim is to build a model for $A \cup \{\alpha \neq \beta\}_{\alpha, \beta} \cup \{i \neq j\}_{i, j}$ (varying $\alpha, \beta$ and $i, j$ as indicated in the statement of the Lemma). We can freely make the following further assumption: if $a, i$ occur in $A$ and $a$ is in normal form, there is some $e$ such that $rd(a, i) = e$ belongs to $A$ (in fact, if this does not hold, it is sufficient to add a further equality $rd(a, i) = e$, with fresh $e$, without destroying the modular property of the constraint).

Let $I^*$ be the set of constants of sort INDEX occurring in $A$ and let $E^*$ be the set of constants of sort ELEM in normal form occurring in $A$ (we have $I^* = I$ and $E^* \subseteq E$). Finally, we let $X$ be the set of free constants of sort ARRAY occurring in $A$ which are in normal form.

We build a model $\mathcal{M}$ as follows (the symbol $+$ denotes disjoint union):

• $\mathtt{INDEX}^{\mathcal{M}} := I^* + \{*\}$;

• $\mathtt{ELEM}^{\mathcal{M}} := E^* + X$;

• $\mathtt{ARRAY}^{\mathcal{M}}$ is the set of total functions from $\mathtt{INDEX}^{\mathcal{M}}$ to $\mathtt{ELEM}^{\mathcal{M}}$; $rd^{\mathcal{M}}$ and $wr^{\mathcal{M}}$ are the standard read and write operations (i.e. $rd^{\mathcal{M}}$ is function application and $wr^{\mathcal{M}}$ is the operation of modifying the first argument function by giving it the third argument as a value for the second argument input);

• for a constant $i$ of sort INDEX, $i^{\mathcal{M}} := i$ for all $i \in I^*$;

• for a constant $e$ of sort ELEM, $e^{\mathcal{M}}$ is the normal form of $e$;

• for a constant $a$ of sort ARRAY in normal form and $i \in I^*$, we put $a^{\mathcal{M}}(i)$ to be equal to the normal form of $rd(a, i)$ (this is some $e \in \mathtt{ELEM}^{\mathcal{M}}$ by our further assumption above); we also put $a^{\mathcal{M}}(*) := a$ (notice that $\mathtt{ELEM}^{\mathcal{M}} = E^* + X$, hence $a \in \mathtt{ELEM}^{\mathcal{M}}$);

• for a constant $a$ of sort ARRAY not in normal form, let $wr(c, I, E)$ be the normal form of $a$: we let $a^{\mathcal{M}}$ be equal to $wr^{\mathcal{M}}(c^{\mathcal{M}}, I^{\mathcal{M}}, E^{\mathcal{M}})$ (this definition is correct because $a$ and $c$ cannot coincide; in fact, since $a < wr(a, I, E)$, the term $wr(a, I, E)$ cannot be the normal form of $a$);

• we shall define $\mathtt{diff}^{\mathcal{M}}$ later on.

Footnote (on the interpretation of ARRAY, $rd$ and $wr$): In the terminology used in Section 3.1, this means that $\mathcal{M}$ is a standard model.

It is clear that in this way all constants $\alpha$ of sort ELEM or ARRAY are interpreted in such a way that, if $\hat{\alpha}$ is the normal form of $\alpha$, then

$\alpha^{\mathcal{M}} = \hat{\alpha}^{\mathcal{M}}$. (4.8)

Also notice that, by the definition of $a^{\mathcal{M}}$, if $e$ is the normal form of $rd(a, i)$, then we have

$rd(a, i)^{\mathcal{M}} = e^{\mathcal{M}}$ (4.9)

in any case (whether $a$ is in normal form or not). Finally, if $wr(c, I, E)$ is the normal form of $a$, then

$a^{\mathcal{M}} = c^{\mathcal{M}} \Rightarrow (I = \emptyset \text{ and } E = \emptyset)$; (4.10)

this is because the only rule that can reduce $a$ must have $a$ as left-hand side and $wr(c, I, E)$ as right-hand side (rules are ground irreducible), thus in the rule $a \to wr(c, I, E) \in A_M$ we must have $I = \emptyset, E = \emptyset$ in case $a^{\mathcal{M}} = c^{\mathcal{M}}$ (recall Definition 4.2(iv)). In more detail, suppose that $I$ and $E$ are not empty and take $i \in I$ and $e \in E$ in corresponding positions. We have that $rd(c, i)^{\mathcal{M}} = rd^{\mathcal{M}}(c^{\mathcal{M}}, i^{\mathcal{M}}) = c^{\mathcal{M}}(i^{\mathcal{M}}) = a^{\mathcal{M}}(i^{\mathcal{M}}) = rd^{\mathcal{M}}(a^{\mathcal{M}}, i^{\mathcal{M}}) = rd(a, i)^{\mathcal{M}}$ (we used the definition of interpretation of a ground term, the fact that $rd^{\mathcal{M}}$ is interpreted as function application and that $a^{\mathcal{M}} = c^{\mathcal{M}}$). Now, since $rd(a, i)$ normalizes to $e$, applying (4.9), we get that $rd(c, i)^{\mathcal{M}} = e^{\mathcal{M}}$, which means, again by (4.9), that $rd(c, i)$ normalizes to $e$ too ($e$ is in normal form, thus if $\tilde{e}$ is the normal form of $rd(c, i)$, we have that $\tilde{e}^{\mathcal{M}} = e^{\mathcal{M}}$ implies $\tilde{e} \equiv e$). This is contrary to Definition 4.2(iv).

Since $A$ is modular, literals in $A$ are flat. It is clear that all negative literals from $A$ are true: in fact, a modular constraint does not contain inequalities between array constants, inequalities between index constants are true by construction, and inequalities between element constants are true by the hypothesis of the Lemma. Also, if $\alpha, \beta$ are either element or array constants in normal form, we have $\alpha^{\mathcal{M}} \neq \beta^{\mathcal{M}}$ by construction (in particular, the interpretations of different array constants both in normal form differ at index $*$). Let us now consider positive literals in $A$: those from $A_M$ are equalities of terms of sort ELEM or ARRAY and consequently are of the kind

$e = d$, $a = wr(c, I, E)$, $rd(a, i) = e$.

Since ground rules are irreducible, $d$ is the normal form of $e$ and $wr(c, I, E)$ is the normal form of $a$, hence we have $e^{\mathcal{M}} = d^{\mathcal{M}}$ and $a^{\mathcal{M}} = wr(c, I, E)^{\mathcal{M}}$ by (4.8) above. For the same reason $a$ and $e$ are in normal form in $rd(a, i) = e$, hence $rd(a, i)^{\mathcal{M}} = e^{\mathcal{M}}$ follows by construction.

It remains to define $\mathtt{diff}^{\mathcal{M}}$ in such a way that flat literals $\mathtt{diff}(a, b) = i$ from $A_I$ are true and axiom (3.3) is satisfied. Before doing that, let us observe that for all free constants $a, b$ occurring in $A$, we have that $a^{\mathcal{M}} = b^{\mathcal{M}}$ is equivalent to $a \downarrow_{A_R} b$. In fact, one side is by (4.8); for the other side, suppose that $a^{\mathcal{M}} = b^{\mathcal{M}}$ and that $wr(c, I, E)$, $wr(c', I', E')$ are the normal forms of $a$ and $b$, respectively. Then $c$ must be equal to $c'$, otherwise $a^{\mathcal{M}}$ and $b^{\mathcal{M}}$ would differ at index $*$. If either $a$ or $b$ is equal to $c$, trivially $a \downarrow_{A_R} b$ follows from (4.10). Otherwise, $a$ and $b$ are both reducible in $A_R$ and, since ground rules are irreducible and the only rules that can reduce an array constant have the left-hand side equal to that array constant, we have that $a \to wr(c, I, E)$ and $b \to wr(c, I', E')$ are both rules in $A_R$: as such, they are subject to Condition (iv) from Definition 4.2. First observe that we must have that $I \equiv I'$: otherwise, if there is $i \in I \setminus I'$, we could infer the following: (i) by (4.8), $b^{\mathcal{M}}(i) = c^{\mathcal{M}}(i)$; (ii) $c^{\mathcal{M}}(i)$ is the normal form of $rd(c, i)$ by construction; (iii) by $a^{\mathcal{M}} = b^{\mathcal{M}}$, $c^{\mathcal{M}}(i)$ is also equal to the normal form of the $e$ having in the list $E$ the same position as $i$ in the list $I$, contrary to Condition (iv) from Definition 4.2. Since terms are normalized with respect to rule (4.3), $I$ and $I'$ coincide not only as sets, but also as lists; this means that the lists $E$ and $E'$ coincide too (the terms $wr(c, I, E)$, $wr(c, I, E')$ are in normal form and we have $wr(c, I, E)^{\mathcal{M}} = wr(c, I, E')^{\mathcal{M}}$). In more detail, let $i, e, e'$ be in the $k$-th positions in the lists $I, E, E'$, respectively. From $wr(c, I, E)^{\mathcal{M}} = wr(c, I, E')^{\mathcal{M}}$, applying $rd^{\mathcal{M}}(-, i^{\mathcal{M}})$, we get $e^{\mathcal{M}} = e'^{\mathcal{M}}$, i.e. $e \downarrow_{A_R} e'$, which means $e \equiv e'$ because $wr(c, I, E)$, $wr(c, I, E')$ are in normal form (in particular, their sub-terms $e, e'$ are not reducible). In conclusion, $a \downarrow_{A_R} b$ holds.

Among the elements of $\mathtt{ARRAY}^{\mathcal{M}}$, some are of the kind $a^{\mathcal{M}}$ for some free constant $a$ of sort ARRAY occurring in $A$ and some are not of this kind: we call the former 'definable' arrays. In principle, it could be that $a^{\mathcal{M}} = b^{\mathcal{M}}$ for different $a, b$, but we have shown that this is possible only when $a$ and $b$ have the same normal form.

We are ready to define $\mathtt{diff}^{\mathcal{M}}$: we must assign a value $\mathtt{diff}^{\mathcal{M}}(\mathbf{a}, \mathbf{b})$ to all pairs of arrays $\mathbf{a}, \mathbf{b} \in \mathtt{ARRAY}^{\mathcal{M}}$. If $\mathbf{a}$ or $\mathbf{b}$ is not definable, or if there are no constants $a, b$ defining them such that $\mathtt{diff}(a, b)$ occurs in $A_I$, we can easily find $\mathtt{diff}^{\mathcal{M}}(\mathbf{a}, \mathbf{b})$ so that axiom (3.3) is true for $\mathbf{a}, \mathbf{b}$: one picks an index where they differ if they are not identical, otherwise the definition can be arbitrary. So let us concentrate on the case in which $\mathbf{a}, \mathbf{b}$ are defined by constants $a, b$ such that the literal $\mathtt{diff}(a, b) = i$ occurs in $A_I$: in this case, we define $\mathtt{diff}^{\mathcal{M}}(a^{\mathcal{M}}, b^{\mathcal{M}})$ to be $i$. Condition (v) from Definition 4.2 (together with the above observation that two constants defining the same array in $\mathcal{M}$ must have an identical normal form) ensures that the definition is correct and that all literals $\mathtt{diff}(a, b) = i \in A_I$ become true. Finally, axiom (3.3) is satisfied by Condition (vi) from Definition 4.2 and the fact that $rd(a, i)^{\mathcal{M}} = rd(b, i)^{\mathcal{M}}$ is equivalent to $rd(a, i) \downarrow_{A_R} rd(b, i)$ (to see the latter, just recall (4.9)). □
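The first part of Lemma 4.3 turns satisfiability of a modular constraint into a purely syntactic test. The Python sketch below illustrates that test under strong assumptions that it does not verify: the input constraint is already modular, only the ground rules of the form (4.7) are supplied (they are the only rules that can rewrite an element constant), and they are given as a dictionary from left-hand to right-hand sides. The names `normal_form` and `modular_constraint_is_satisfiable` are ours.

```python
def normal_form(const, elem_rules):
    """Normal form of an element constant under rules e -> d of the form (4.7)."""
    seen = set()
    while const in elem_rules and const not in seen:
        seen.add(const)          # guard against ill-formed (cyclic) input
        const = elem_rules[const]
    return const

def modular_constraint_is_satisfiable(elem_rules, elem_disequalities):
    """Lemma 4.3 test: no literal e != d may have e and d joinable."""
    return all(
        normal_form(e, elem_rules) != normal_form(d, elem_rules)
        for e, d in elem_disequalities
    )
```

With the rules $e_1 \to d$ and $e_2 \to d$, the literal $e_1 \neq e_2$ makes the constraint unsatisfiable, whereas $e_1 \neq e_3$ does not.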

Remark 4.4. As we said, the importance of Definition 4.2 lies in Lemma 4.3 and in Proposition 4.5 below. On the other hand, it is not true that if $A$ is modular, then $A$ entails (modulo $\mathcal{AX}_{\mathrm{diff}}$) a positive literal $t = v$ iff $t \downarrow_{A_R} v$, even in case $t, v$ are ground flat terms. As a counterexample, consider $A = \{rd(a, i) \to e\}$; we have $A \vdash_{\mathcal{AX}_{\mathrm{diff}}} a = wr(a, i, e)$ but $a \not\downarrow_{A_R} wr(a, i, e)$. However, the proof of Lemma 4.3 shows that the following weaker (but still important) property holds: if $A$ is modular and $t, v$ are terms of the same sort occurring in $A$, then $A \vdash_{\mathcal{AX}_{\mathrm{diff}}} t = v$ iff $t \downarrow_{A_R} v$. This may look unusual; however, recall that our aim is not to decide equality by normalization but to have algorithms for satisfiability and interpolation.

4.2. Combining modular constraints. Let $A, B$ be two constraints in the signatures $\Sigma^A, \Sigma^B$ obtained from the signature $\Sigma$ by adding some free constants, and let $\Sigma^C := \Sigma^A \cap \Sigma^B$. Given a term, a literal or a formula $\varphi$ we call it:

• AB-common iff it is defined over $\Sigma^C$;

• A-local (resp. B-local) if it is defined over $\Sigma^A$ (resp. $\Sigma^B$);

• A-strict (resp. B-strict) iff it is A-local (resp. B-local) but not AB-common;

• AB-mixed if it contains symbols in both $(\Sigma^A \setminus \Sigma^C)$ and $(\Sigma^B \setminus \Sigma^C)$;

• AB-pure if it does not contain symbols in both $(\Sigma^A \setminus \Sigma^C)$ and $(\Sigma^B \setminus \Sigma^C)$.
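The classification just given depends only on the set of symbols occurring in a term, literal or formula. As an informal illustration, the following Python sketch computes all the properties at once; it assumes that signatures and expressions are both represented as plain sets of symbol names, and the function name `classify` is ours.

```python
def classify(symbols, sig_a, sig_b):
    """Classify an expression, given the set of symbols occurring in it."""
    symbols = set(symbols)
    common = sig_a & sig_b                         # the shared signature
    ab_common = symbols <= common
    a_local = symbols <= sig_a
    b_local = symbols <= sig_b
    uses_a_only = bool(symbols & (sig_a - common))  # some symbol outside the shared part, from A
    uses_b_only = bool(symbols & (sig_b - common))  # some symbol outside the shared part, from B
    return {
        "AB-common": ab_common,
        "A-local": a_local,
        "B-local": b_local,
        "A-strict": a_local and not ab_common,
        "B-strict": b_local and not ab_common,
        "AB-mixed": uses_a_only and uses_b_only,
        "AB-pure": not (uses_a_only and uses_b_only),
    }
```

For instance, with $a$ occurring only in $A$, $b$ only in $B$ and $c$ in both, the term $wr(a, i, e)$ over shared $i, e$ is A-strict and AB-pure, while the literal $a = b$ is AB-mixed.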

(Notice that, sometimes in the literature about interpolation, "A-local" and "B-local" are used to denote what we call here "A-strict" and "B-strict".) The following modularity result is crucial to justify our interpolation algorithm for $\mathcal{AX}_{\mathrm{diff}}$.

Proposition 4.5. Let $A = \langle A_I, A_M \rangle$ and $B = \langle B_I, B_M \rangle$ be constraints in expanded signatures $\Sigma^A, \Sigma^B$ as above (here $\Sigma$ is the signature of $\mathcal{AX}_{\mathrm{diff}}$); let $A, B$ be both consistent and modular. Then $A \cup B$ is consistent and modular, in case all the following conditions hold:

(O) an AB-common literal belongs to $A$ iff it belongs to $B$;

(I) every rewrite rule in $A_M \cup B_M$ whose left-hand side is AB-common has also an AB-common right-hand side;

(II) if $a, b$ are both AB-common and $\mathtt{diff}(a, b) = i \in A_I \cup B_I$, then $i$ is AB-common too;

(III) if a rewrite rule of the kind $a \to wr(c, I, E)$ is in $A_M \cup B_M$ and the term $wr(c, I, E)$ is AB-common, so is the constant $a$.

Proof. Since we cannot rewrite AB-common terms to terms which are not, it is easy to see that $A_M \cup B_M$ is still convergent and ground irreducible; the other conditions from Definition 4.2 are trivial, except condition (v). The latter is guaranteed by the hypotheses (II)-(III) as follows: the relevant case is when, say, $\mathtt{diff}(a, b) = i \in A_I$ is A-local and $\mathtt{diff}(a', b') = i' \in B_I$ is B-local. If $a \downarrow a'$, since $A_M$ and $B_M$ are ground irreducible, we have that a single rewrite step reduces both $a$ and $a'$ to their normal form, that is we have

$a \to wr(c, I, E) \leftarrow a'$.

Now $wr(c, I, E)$ is AB-common, because the rules $a \to wr(c, I, E)$, $a' \to wr(c, I, E)$ are in $A_M$ and in $B_M$, respectively. By hypothesis (III), we have that $a$ and $a'$ are AB-common too; the same applies to $b, b'$ and hence to $i, i'$ by (II). Thus $\mathtt{diff}(a', b') = i'$ is AB-common and belongs to $A_I$, hence $i \equiv i'$ because $A$ is modular.

Since all conditions from Definition 4.2 are satisfied, $A \cup B$ is modular. Lemma 4.3 applies, thus yielding consistency. □

The above proof is so easy mainly because ground rewrite rules cannot superpose with the non-ground rewrite rules (4.1)-(4.4) (with the exception of the rewrite rules $e \to d$, which may superpose but with trivially confluent critical pairs): this is the main benefit of our choice of orienting equalities $a = wr(b, I, E)$ from left to right (and not from right to left).

We conclude this section with a remark about the combination of modular constraints in $\mathcal{AX}_{\mathrm{diff}}$ with constraints in other theories. The theory $\mathcal{AX}_{\mathrm{diff}}$ is stably infinite (in all its sorts) but non-convex: this means that it is suitable for Nelson-Oppen combination, but that disjunctions of equalities (not just equalities) need to be propagated from an $\mathcal{AX}_{\mathrm{diff}}$-constraint, in case it is involved in a combined problem. Actually, this does not happen for modular constraints, as is shown by the statement of Lemma 4.3. In other words, no disjunction of equalities needs to be propagated from a modular constraint $A$, and only equalities that can be syntactically extracted from $A$ need to be propagated.

5. A Solver for Arrays with diff 

The first step towards the quantifier-free interpolation procedure for $\mathcal{AX}_{\mathrm{diff}}$ is the design of a satisfiability solver. Although a solver for this theory can be easily derived from existing solvers for $\mathcal{AX}$ or $\mathcal{AX}_{\mathrm{ext}}$, we need a specific algorithm from which interpolants can be extracted. To do this, Lemma 4.3 will play an important role by allowing for the design of $\exists$-equivalence preserving transformations that, once successively applied to a given constraint $A$, will bring it to a consistent modular constraint (if possible). Failure of applying these transformations implies that $A$ is unsatisfiable. In other words, the $\exists$-equivalence preserving transformations will determine whether a finite constraint $A$ is satisfiable or not by transforming it into a modular $\exists$-equivalent constraint.

One of the key design choices underlying our transformations is to separate the "index" part of a constraint, which will be handled by guessing, from the "array" and "elem" parts, which will be subject to rewriting. Another important design decision is to distinguish a preprocessing and a completion phase. In the preprocessing phase, besides flattening (see, e.g., [2]) and similar operations, a complete guessing of equalities/inequalities among index constants will be performed. This guessing will be realized by backtracking: if the completion phase terminates in a failure, another guess has to be tried, and unsatisfiability can only be declared when all guesses fail. The completion phase will guarantee the confluence of the current rewriting system $A_R$ (recall Definition 4.2). The confluence of $A_R$ is the main requirement for a constraint to be modular.



5.1. Preprocessing. The preprocessing phase consists of the following sequential steps applied to our initial constraint $A$:

Step 1 Flatten $A$, by replacing sub-terms with fresh constants and by adding the related defining equalities.

Step 2 Replace array inequalities $a \neq b$ by the following literals ($i, e, d$ are fresh):
$\mathtt{diff}(a, b) = i$, $rd(b, i) = e$, $rd(a, i) = d$, $d \neq e$.

Step 3 Guess a partition of index constants, i.e., for any pair of indexes $i, j$ add either $i = j$ or $i \neq j$ (but not both of them); then remove the positive literals $i = j$ by replacing $i$ by $j$ everywhere (if $i > j$ according to the symbol precedence, otherwise replace $j$ by $i$); if an inconsistent literal $i \neq i$ is produced, try with another guess (and if all guesses fail, report unsat).

Step 4 For all $a, i$ such that $rd(a, i) = e$ does not occur in the constraint, add such a literal $rd(a, i) = e$ with fresh $e$.

At the end of the preprocessing phase, we get a finite set of flat constraints; the disjunction of these constraints is $\exists$-equivalent to the original constraint. For each of these constraints, go to the completion phase: if the transformations below can be exhaustively applied (without failure) to at least one of the constraints, report sat, otherwise report unsat. Failure can be caused by instructions (V) below.

The reason for inserting Step 4 above is just to simplify Orientation and Gaussian completion below. Notice that, even if rules $rd(a, i) \to e$ can be removed during completion, the following invariant is maintained: terms $rd(a, i)$ always reduce to constants of sort ELEM.
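The guessing of Step 3 amounts to enumerating the set partitions of the index constants and discarding those that identify two constants already declared different. The following Python sketch illustrates this; it assumes that index constants are strings, that the symbol precedence is the ordinary string ordering (so the smallest member of each block is kept as representative), and that the known disequalities are given as pairs. The names `partitions` and `guess_index_arrangements` are ours, and the sketch enumerates guesses eagerly rather than by backtracking.

```python
def partitions(items):
    """Yield every partition of `items` as a list of blocks (lists)."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for p in partitions(rest):
        for n in range(len(p)):          # put `first` into an existing block
            yield p[:n] + [[first] + p[n]] + p[n + 1:]
        yield [[first]] + p              # or give it a block of its own

def guess_index_arrangements(indexes, disequalities):
    """Yield one substitution (index -> representative) per consistent guess."""
    for p in partitions(sorted(indexes)):
        rep = {i: min(block) for block in p for i in block}
        # an inconsistent literal i != i arises when both sides share a representative
        if all(rep[i] != rep[j] for i, j in disequalities):
            yield rep
```

With three index constants there are five partitions; if $i \neq j$ is already in the constraint, only three guesses survive. The number of partitions grows quickly with the number of index constants, which is why this step is realized by backtracking in practice.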



5.2. Completion. The completion phase consists of various transformations that should be non-deterministically executed until no rule or a failure instruction applies. For clarity, we divide the transformations into five groups.

(I) Orientation. This group contains a single instruction: get rid of badly orientable equalities, by using the equivalences Reflexivity and Symmetry of Figure 1. A badly orientable equality $a = wr(b, I, E)$ (with $a < b$), after normalization of the term $wr(b, I, E)$ with respect to the non-ground rules (4.3)-(4.4), is replaced by an equality of the form $b = wr(a, I, D)$ and by the equalities $rd(a, I) = E$ (all "read literals" required by the left-hand side of Symm come from the above invariant). A badly orientable equality $a = wr(a, I, E)$ is removed and replaced by read literals only (or by nothing if $I, E$ are empty).

(II) Gaussian completion. We now take care of the confluence of $A_R$ (i.e., point (iii) of Definition 4.2). To this end, we consider all the critical pairs that may arise among our rewriting rules (4.5)-(4.7) (recall that there is no need to examine overlaps involving the non-ground rules (4.1)-(4.4)). To treat the relevant critical pairs, we combine standard Knuth-Bendix completion for congruence closure with a specific method ("Gaussian completion") based on the equivalences Symmetry, Transitivity and Conflict of Figure 1. The critical pairs are listed below. Two preliminary observations are in order. First, we normalize a critical pair by using $\to^*$ before recovering convergence by adding a suitably oriented equality and removing the parent equalities (the symbol $\to^*$ denotes the reflexive and transitive closure of the rewrite relation induced by the rewrite rules $A_R \cup \{(4.1)$-$(4.4)\}$). Second, the provisos of all the equivalences in Figure 1 used below (i.e., Symm, Trans, and Confl) are satisfied because of the pre-processing Step 3 above.



(C1): $wr(b_1, I_1, E_1) \;{}^*\!\leftarrow wr(b'_1, I'_1, E'_1) \leftarrow a \to wr(b'_2, I'_2, E'_2) \to^* wr(b_2, I_2, E_2)$

with $b_1 > b_2$. We proceed in two steps. First, we use Symm (from right to left) to replace the parent rule $a \to wr(b'_1, I'_1, E'_1)$ with

$wr(a, I_1, F) = b_1 \wedge rd(a, I_1) = E_1$

for a suitable list $F$ of constants of sort ELEM (notice that the equalities $rd(b_1, I_1) = F$, which are required to apply Symm, are already available because terms of the form $rd(b_1, j)$ for $j$ in $I_1$ always reduce to constants of sort ELEM by the invariant resulting from the application of Step 4 in the pre-processing phase). Then, we apply Trans to the previously derived equality $b_1 = wr(a, I_1, F)$ and to the normalized second equality of the critical pair (i.e., $a = wr(b_2, I_2, E_2)$) and we derive

$b_1 = wr(b_2, I_2 \cdot I_1, E_2 \cdot F) \wedge a = wr(b_2, I_2, E_2)$. (5.1)

Hence, we are entitled to replace $b_1 = wr(a, I_1, F)$ with the rule $b_1 \to wr(b_2, J, D)$, where $J$ and $D$ are lists obtained by normalizing the right-hand side of the first equality of (5.1) with respect to the non-ground rules (4.3) and (4.4). To summarize: the parent rules are removed and replaced by the rules

$b_1 \to wr(b_2, J, D)$, $a \to wr(b_2, I_2, E_2)$

and a bunch of new equalities of the form $rd(a, i) = e$, giving rise, in turn, to rules of the form $rd(b_2, i) \to e$ or to rewrite rules of the form (4.7) after normalization of their left members (normalization of terms $rd(a, i)$ is indeed needed for the termination argument of Theorem 5.1 below to work).



(C2): $wr(b, I_1, E_1) \;{}^{*}\!\leftarrow\; wr(b_1', I_1', E_1') \leftarrow a \to wr(b_2', I_2', E_2') \to^{*} wr(b, I_2, E_2)$

Since identities like $wr(c, H, G) = wr(c, \pi(H), \pi(G))$ are $\mathcal{AX}_{\mathrm{diff}}$-valid for every permutation
$\pi$ (under the proviso Distinct($H$)), it is harmless to suppose that the set of index
variables $I = I_1 \cap I_2$ coincides with the common prefix of the lists $I_1$ and $I_2$; hence we
have $I_1 = I \cdot J$ and $I_2 = I \cdot H$ for suitable disjoint lists $J$ and $H$. Then, let $E$ and $E'$ be
the prefixes of $E_1$ and $E_2$, respectively, of length equal to that of $I$; and let $E_1 = E \cdot D$
and $E_2 = E' \cdot F$ for suitable lists $D$ and $F$. At this point, we can apply Confl to replace
both parent rules forming the critical pair with

$a = wr(b, I, E) \wedge E = E' \wedge rd(b, J) = D \wedge rd(b, H) = F$,

where the first equality is oriented from left to right (i.e., $a \to wr(b, I, E)$).
(Ill) Knuth-Bendix completion. The remaining critical pairs are treated by standard 
completion methods for congruence closure. 



(C3): $d \;{}^{*}\!\leftarrow\; rd(wr(b, I, E), i) \leftarrow rd(a, i) \to e' \to^{*} e$

Remove the parent rule $rd(a, i) \to e'$ and, depending on whether $d > e$, $e > d$, or $d = e$,
add the rule $d \to e$, $e \to d$, or do nothing. (Notice that terms of the form $rd(b, j)$ are
always reducible because of the invariant of Step 4 in the pre-processing phase; hence,
$rd(wr(b, I, E), i)$ always reduces to some constant of sort ELEM.)



(C4): $e \;{}^{*}\!\leftarrow\; e' \leftarrow rd(a, i) \to d' \to^{*} d$



Orient the critical pair (if e and d are not identical), add it as a new rule and remove 
one parent rule. 



(C5): $d \;{}^{*}\!\leftarrow\; d' \leftarrow e \to d_1' \to^{*} d_1$



Orient the critical pair (if d and di are not identical), add it as a new rule and remove 
one parent rule. 

(IV) Reduction. The instructions in this group simplify the current rewrite rules. 
(R1): If the right-hand side of a current ground rewrite rule can be reduced, reduce it as
much as possible, remove the old rule, and replace it with the newly obtained reduced
rule. Redundant equalities like $t = t$ are also removed.
(R2): For every rule $a \to wr(b, I, E) \in A_M$, after normalization of the term $wr(b, I, E)$ with
respect to the non-ground rules (4.3)-(4.4), exhaustively apply Reduction in Figure 1
from left to right (this amounts to the following: if there are $i, e$ in the same position $k$
in the lists $I, E$ such that $rd(b, i) \downarrow_{A_M} e$, replace $a = wr(b, I, E)$ with $a = wr(b, I{-}k, E{-}k)$).
(R3): If $\mathrm{diff}(a, b) = i \in A_I$, $rd(a, i) \downarrow_{A_M} rd(b, i)$ and $a > b$, add the rule $a \to b$; also
replace $\mathrm{diff}(a, b) = i$ by $\mathrm{diff}(b, b) = i$ (this is needed for termination: it prevents the rule
from being applied indefinitely).

(V) Failure. The instructions in this group aim at detecting inconsistency. 

(U1): If for some negative literal $e \neq d \in A_M$ we have $e \downarrow_{A_M} d$, report failure and
backtrack to Step 3 of the pre-processing phase.
(U2): If $\{\mathrm{diff}(a, b) = i, \mathrm{diff}(a', b') = i'\} \subseteq A_I$ and $a \downarrow_{A_M} a'$ and $b \downarrow_{A_M} b'$ for $i \neq i'$, report
failure and backtrack to Step 3 of the pre-processing phase.

Notice that the instructions in the last two groups may require a confluence test $\alpha \downarrow_{A_M} \beta$
that can be effectively performed in case the instructions from groups (II)-(III) have been
exhaustively applied, because then all critical pairs have been examined and the rewrite
system $A_M$ is confluent. If this is not the case, one may pragmatically compute and compare
any normal form of $\alpha$ and $\beta$, keeping in mind that the test has to be repeated when all
completion instructions (II)-(III) have been exhaustively applied.
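
The replacement step used for critical pair (C2) can be sanity-checked by brute force on small finite arrays. The following is a minimal sketch, not part of the procedure itself: arrays are modelled as Python tuples, `wr` and `confl_is_valid` are hypothetical helper names, and the lists $I$, $J$, $H$ are each restricted to a single index for brevity.

```python
from itertools import permutations, product


def wr(array, indexes, elems):
    """Return a copy of `array` with elems[k] stored at indexes[k]."""
    out = list(array)
    for i, e in zip(indexes, elems):
        out[i] = e
    return tuple(out)


def confl_is_valid(n_index=3, n_elem=2):
    """Compare the two parent equalities of (C2) with their replacement."""
    arrays = list(product(range(n_elem), repeat=n_index))
    for a, b in product(arrays, repeat=2):
        # i plays the common prefix I; j and h play the disjoint tails J, H.
        for i, j, h in permutations(range(n_index), 3):
            for e, e2, d, f in product(range(n_elem), repeat=4):
                parents = (a == wr(b, [i, j], [e, d])
                           and a == wr(b, [i, h], [e2, f]))
                replaced = (a == wr(b, [i], [e]) and e == e2
                            and b[j] == d and b[h] == f)
                if parents != replaced:
                    return False
    return True
```

Such a check only covers the chosen finite domain; it illustrates the equivalence and does not replace the proof.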

Theorem 5.1. The above procedure decides constraint satisfiability in $\mathcal{AX}_{\mathrm{diff}}$.

Proof. Correctness and completeness of the solver are clear: since all steps and instructions
from Section 5 manipulate the constraint up to $\exists$-equivalence, it follows that if all guessings
originated by Step 3 fail, the input constraint is unsatisfiable and, if one of them succeeds,
the exhaustive application of the completion instructions leads to a modular constraint
which is satisfiable by Lemma 4.3.

We must only consider termination; to show that any sequence of our instructions
terminates, we use a standard technique. With every positive literal $l = r$ we associate
the multi-set of terms $\{l, r\}$; with every negative literal $l \neq r$, we associate the multi-set
of terms $\{l, l, r, r\}$. Finally, with a constraint $A$ we associate the multi-set $M(A)$ of the
multi-sets associated with every literal from $A$. Now it is easy to see that such multi-set
decreases after the application of any instruction. □
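
The measure used in the termination argument can be illustrated with a small sketch of the multi-set extension of an ordering (in the style of Dershowitz and Manna). Everything below is an assumption made for illustration: terms are modelled as integers standing for their position in the term ordering, and the function names are hypothetical.

```python
from collections import Counter


def multiset_greater(m, n, greater):
    """True if multi-set m is strictly above n in the multi-set extension."""
    m, n = Counter(m), Counter(n)
    common = m & n
    m_only, n_only = m - common, n - common
    if not m_only:
        return False  # m equals n, or m is contained in n
    # Every element gained by n must be dominated by one lost from m.
    return all(any(greater(y, x) for y in m_only) for x in n_only)


def literal_measure(lhs, rhs, positive=True):
    """{l, r} for l = r, and {l, l, r, r} for a negative literal."""
    terms = [lhs, rhs] if positive else [lhs, lhs, rhs, rhs]
    return tuple(sorted(terms))


def constraint_greater(c1, c2):
    """Compare two constraints, given as lists of literal measures."""
    def term_greater(x, y):
        return x > y

    def literal_greater(p, q):
        return multiset_greater(p, q, term_greater)

    return multiset_greater(c1, c2, literal_greater)
```

For instance, replacing the literal with terms 5 and 3 by two literals over the smaller terms 3, 2 and 3, 1 decreases the measure, even though the number of literals grows.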

The termination analysis in the proof of Theorem 5.1 can be refined so as to show that
our algorithm is in NP, which is optimal because satisfiability of quantifier-free formulae in
$\mathcal{AX}_{\mathrm{ext}}$ is already NP-complete [10].

6. The Interpolation Algorithm for Arrays with diff 

In the literature one can roughly distinguish two approaches to the problem of computing 
interpolants. In the former (see e.g. |111I47| ). an interpolating calculus is obtained from 
a standard calculus by adding decorations so as to enable the recursive construction of 
an interpolating formula from a proof; in the latter (see, e.g., [22, 28, 56]), the focus is on
how to extend an available decision procedure to return interpolants. Our methodology is
similar to the second approach, since we add the capability of computing interpolants to
the satisfiability procedure in Section 5. However, we do this by designing a flexible and
abstract framework, relying on the identification of basic operations that can be performed 
independently from the method used by the underlying satisfiability procedure to derive a 
refutation. 

6.1. Interpolating Metarules. Let now $A, B$ be constraints in signatures $\Sigma^A, \Sigma^B$ expanded
with free constants and $\Sigma^C = \Sigma^A \cap \Sigma^B$; we shall refer to the definitions of AB-common,
A-local, B-local, A-strict, B-strict, AB-mixed, AB-pure terms, literals and formulae
introduced earlier. Our goal is to produce, in case $A \wedge B$ is $\mathcal{AX}_{\mathrm{diff}}$-unsatisfiable, a
ground AB-common sentence $\phi$ such that $A \vdash_{\mathcal{AX}_{\mathrm{diff}}} \phi$ and $\phi \wedge B$ is $\mathcal{AX}_{\mathrm{diff}}$-unsatisfiable.

Let us examine some of the transformations to be applied to $A, B$. Suppose for instance
that the literal $\psi$ is AB-common and such that $A \vdash_{\mathcal{AX}_{\mathrm{diff}}} \psi$; then we can transform $B$ into
$B' = B \cup \{\psi\}$. Suppose now that we got an interpolant $\phi$ for the pair $A, B'$: clearly,
we can derive an interpolant for the original pair $A, B$ by taking $\phi \wedge \psi$. The idea is to
collect some useful transformations of this kind. Notice that these transformations can also
modify the signatures $\Sigma^A, \Sigma^B$, in the sense that the signature of the pair $A', B'$ obtained
after applying a single transformation to a pair $A, B$ might be different from the signature
of $A, B$ (typically, the signature of $A', B'$ may contain extra fresh constants). For instance,
suppose that $t$ is an AB-common term and that $c$ is a fresh constant; then we can put
$A' = A \cup \{c = t\}$, $B' = B \cup \{c = t\}$: in fact, if $\phi$ is an interpolant for $A', B'$, then $\phi(t/c)$ is an
interpolant for $A, B$. (Notice that the fresh constant $c$ is now a shared symbol, because $\Sigma^A$ is
enlarged to $\Sigma^A \cup \{c\}$, $\Sigma^B$ is enlarged to $\Sigma^B \cup \{c\}$ and hence $(\Sigma^A \cup \{c\}) \cap (\Sigma^B \cup \{c\}) = \Sigma^C \cup \{c\}$.)
The transformations we need are called metarules and are listed in Table 1 below (in the
Table and more generally in this Subsection, we use the notation $\phi \vdash \psi$ for $\phi \vdash_{\mathcal{AX}_{\mathrm{diff}}} \psi$).



Rules Redplus1, Redplus2 can be seen as instances of Rules Disjunction1, Disjunction2 (for $n = 1$),
thus they are redundant. In Rule Propagate1, one can change the proviso to the weaker requirement '$\psi \in A$

An interpolating metarules refutation for $A, B$ is a labelled tree having the following
properties: (i) nodes are labelled by pairs of finite sets of constraints; (ii) the root is labelled
by $A, B$; (iii) the leaves are labelled by a pair $\tilde{A}, \tilde{B}$ such that $\bot \in \tilde{A} \cup \tilde{B}$; (iv) each non-leaf
node is the conclusion of a rule from Table 1 and its successors are the premises of that rule.
The crucial properties of the metarules are summarized in the following two Propositions.

Proposition 6.1. The unary metarules $\dfrac{A' \mid B'}{A \mid B}$ from Table 1 have the property that $A \wedge B$
is $\exists$-equivalent to $A' \wedge B'$; similarly, the n-ary metarules $\dfrac{A_1 \mid B_1 \;\cdots\; A_n \mid B_n}{A \mid B}$ from Table 1
have the property that $A \wedge B$ is $\exists$-equivalent to $\bigvee_{k=1}^{n} (A_k \wedge B_k)$.

Proposition 6.2. If there exists an interpolating metarules refutation for $A, B$ then there
is a quantifier-free interpolant for $A, B$ (namely there exists a quantifier-free AB-common
sentence $\phi$ such that $A \vdash \phi$ and $B \wedge \phi \vdash \bot$). The interpolant $\phi$ is recursively computed
applying the relevant interpolating instructions from Table 1.

The proofs of both Propositions 6.1 and 6.2 are straightforward. The following
observations are the basis of such proofs. The metarules are applied bottom-up whereas
interpolants are computed (from an interpolating refutation) in a top-down manner. We should
have labelled nodes in an interpolating metarules refutation by 4-tuples $(\Sigma^A, A, \Sigma^B, B)$,
where $\Sigma^A, \Sigma^B$ are signatures expanded with free constants, $A$ is a $\Sigma^A$-constraint and $B$ is
a $\Sigma^B$-constraint. The shared signature of the node labelled $(\Sigma^A, A, \Sigma^B, B)$ (i.e. the signature
where interpolants are recursively computed) is taken to be $\Sigma^C = \Sigma^A \cap \Sigma^B$; the root
signature pair is the pair of signatures comprising all symbols occurring in the original pair
of constraints. We did not make all this explicit in order to avoid notation overhead. Notice
that the only metarules that modify the signatures are (Define0), (Define1), (Define2)
(which add $a$ to $\Sigma^A \cap \Sigma^B$, $\Sigma^A$, $\Sigma^B$, respectively). Some other rules like (ConstElim0),
(ConstElim1), (ConstElim2) could in principle restrict the signature, but signature restriction
is not relevant for the computation of interpolants: there is no need that all AB-common
symbols occur in the interpolants, but we certainly do not want extra symbols to occur in
them, so only bottom-up signature expansion must be tracked.
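
The recursive computation stated in Proposition 6.2 can be sketched as a traversal of a refutation tree. This is only an illustration under assumed encodings: a node is a hypothetical triple `(rule, data, premises)`, formulae and terms are nested tuples, and constant names are assumed not to clash with the operator tags.

```python
from functools import reduce

TRUE, FALSE = ("true",), ("false",)
PASS_THROUGH = {"Define1", "Define2", "Redplus1", "Redplus2", "Redminus1",
                "Redminus2", "ConstElim0", "ConstElim1", "ConstElim2"}


def conj(p, q):
    if FALSE in (p, q):
        return FALSE
    if TRUE in (p, q):
        return q if p == TRUE else p
    return ("and", p, q)


def disj(p, q):
    if TRUE in (p, q):
        return TRUE
    if FALSE in (p, q):
        return q if p == FALSE else p
    return ("or", p, q)


def subst(formula, const, term):
    """Replace every occurrence of the constant `const` by `term`."""
    if formula == const:
        return term
    if isinstance(formula, tuple):
        return tuple(subst(part, const, term) for part in formula)
    return formula


def interpolant(node):
    """Apply the Int. instruction of each rule, from the leaves to the root."""
    rule, data, premises = node
    parts = [interpolant(p) for p in premises]
    if rule == "Close1":
        return FALSE
    if rule == "Close2":
        return TRUE
    if rule == "Propagate1":
        return conj(parts[0], data["psi"])
    if rule == "Propagate2":
        return ("implies", data["psi"], parts[0])
    if rule == "Define0":
        return subst(parts[0], data["a"], data["t"])
    if rule == "Disjunction1":
        return reduce(disj, parts)
    if rule == "Disjunction2":
        return reduce(conj, parts)
    if rule in PASS_THROUGH:
        return parts[0]
    raise ValueError(f"unknown metarule: {rule}")
```

The sketch simplifies conjunctions and disjunctions with the constants true and false on the fly; the provisos of the metarules are not checked and are assumed to hold for the tree that is passed in.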

6.2. The Interpolating Solver. The metarules are complete, i.e. if $A \wedge B$ is $\mathcal{AX}_{\mathrm{diff}}$-unsatisfiable,
then (since we know that an interpolant exists) a single application of (Propagate1)
and (Close2) gives an interpolating metarules refutation. This observation shows
that metarules are by no means better than the brute force enumeration of formulae to
find interpolants. However, metarules are useful to design an algorithm manipulating pairs
of constraints based on transformation instructions. In fact, each of the transformation
instructions can be justified by a metarule (or by a sequence of metarules): in this way, if
our instructions form a complete and terminating algorithm, we can use Proposition 6.2 to
get the desired interpolants. The main advantage of using metarules as justifications is that
we just need to take care of the completeness and termination of the algorithm, and not
of the interpolants anymore. Here "completeness" means that our transformations should
be able to bring a pair $(A, B)$ of constraints into a pair $(A', B')$ that either matches the
requirements of Proposition 4.5 or is explicitly inconsistent, in the sense that $\bot \in A' \cup B'$.



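and $\psi$ is AB-common' (the case $A \vdash \psi$ could be obtained by applying Redplus1); a similar observation
applies to Propagate2. We thank an anonymous referee for these remarks.

Close1. Premise: none. Conclusion: $A \mid B$. Prv.: $A$ is unsat. Int.: $\phi' \equiv \bot$.
Close2. Premise: none. Conclusion: $A \mid B$. Prv.: $B$ is unsat. Int.: $\phi' \equiv \top$.
Propagate1. Premise: $A \mid B \cup \{\psi\}$. Conclusion: $A \mid B$. Prv.: $A \vdash \psi$ and $\psi$ is AB-common. Int.: $\phi' \equiv \phi \wedge \psi$.
Propagate2. Premise: $A \cup \{\psi\} \mid B$. Conclusion: $A \mid B$. Prv.: $B \vdash \psi$ and $\psi$ is AB-common. Int.: $\phi' \equiv \psi \to \phi$.

Define0. Premise: $A \cup \{a = t\} \mid B \cup \{a = t\}$. Conclusion: $A \mid B$. Prv.: $t$ is AB-common, $a$ fresh. Int.: $\phi' \equiv \phi(t/a)$.
Define1. Premise: $A \cup \{a = t\} \mid B$. Conclusion: $A \mid B$. Prv.: $t$ is A-local and $a$ is fresh. Int.: $\phi' \equiv \phi$.
Define2. Premise: $A \mid B \cup \{a = t\}$. Conclusion: $A \mid B$. Prv.: $t$ is B-local and $a$ is fresh. Int.: $\phi' \equiv \phi$.

Disjunction1. Premises: $\cdots\; A \cup \{\psi_k\} \mid B \;\cdots$. Conclusion: $A \mid B$. Prv.: $\bigvee_{k=1}^{n} \psi_k$ is A-local and $A \vdash \bigvee_{k=1}^{n} \psi_k$. Int.: $\phi' \equiv \bigvee_{k=1}^{n} \phi_k$.
Disjunction2. Premises: $\cdots\; A \mid B \cup \{\psi_k\} \;\cdots$. Conclusion: $A \mid B$. Prv.: $\bigvee_{k=1}^{n} \psi_k$ is B-local and $B \vdash \bigvee_{k=1}^{n} \psi_k$. Int.: $\phi' \equiv \bigwedge_{k=1}^{n} \phi_k$.

Redplus1. Premise: $A \cup \{\psi\} \mid B$. Conclusion: $A \mid B$. Prv.: $A \vdash \psi$ and $\psi$ is A-local. Int.: $\phi' \equiv \phi$.
Redplus2. Premise: $A \mid B \cup \{\psi\}$. Conclusion: $A \mid B$. Prv.: $B \vdash \psi$ and $\psi$ is B-local. Int.: $\phi' \equiv \phi$.
Redminus1. Premise: $A \mid B$. Conclusion: $A \cup \{\psi\} \mid B$. Prv.: $A \vdash \psi$ and $\psi$ is A-local. Int.: $\phi' \equiv \phi$.
Redminus2. Premise: $A \mid B$. Conclusion: $A \mid B \cup \{\psi\}$. Prv.: $B \vdash \psi$ and $\psi$ is B-local. Int.: $\phi' \equiv \phi$.

ConstElim1. Premise: $A \mid B$. Conclusion: $A \cup \{a = t\} \mid B$. Prv.: $a$ is A-strict and does not occur in $A, t$. Int.: $\phi' \equiv \phi$.
ConstElim2. Premise: $A \mid B$. Conclusion: $A \mid B \cup \{b = t\}$. Prv.: $b$ is B-strict and does not occur in $B, t$. Int.: $\phi' \equiv \phi$.
ConstElim0. Premise: $A \mid B$. Conclusion: $A \cup \{c = t\} \mid B \cup \{c = t\}$. Prv.: $c, t$ are AB-common, $c$ does not occur in $A, B, t$. Int.: $\phi' \equiv \phi$.

Table 1: Interpolating Metarules: each rule has a proviso Prv. and an instruction Int. for recursively
computing the new interpolant $\phi'$ from the old one(s) $\phi, \phi_1, \dots, \phi_k$.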

The latter is obviously the case whenever the original pair $(A, B)$ is $\mathcal{AX}_{\mathrm{diff}}$-unsatisfiable
and it is precisely the case leading to an interpolating metarules refutation.

The basic idea is that of invoking the algorithm of Section 5 on $A$ and $B$ separately
and to propagate equalities involving AB-common terms. We shall assume an ordering
precedence making AB-common constants smaller than A-strict or B-strict constants of
the same sort. However, this is not sufficient to prevent the algorithm of Section 5 from
generating literals and rules violating one or more of the hypotheses of Proposition 4.5: this
is why the extra correcting instructions of group ($\gamma$) below are needed. Our interpolating
algorithm has a pre-processing and a completion phase, like the algorithm from Section 5.

Pre-processing. In this phase the four Steps of Section 5.1 are performed on both $A$
and $B$; to justify these steps we need metarules (Define0,1,2), (Redplus1,2), (Redminus1,2),
(Disjunction1,2), (ConstElim0,1,2), and (Propagate1,2). The latter are needed because if $i, j$ are
AB-common, the guessing of $i = j$ versus $i \neq j$ in Step 3 can be done, say, in the A-component
and then propagated to the B-component. At the end of the pre-processing phase, the
following properties (to be maintained as invariants afterwards) hold:

(I1): $A$ (resp. $B$) contains $i \neq j$ for all A-local (resp. B-local) constants $i, j$ of sort INDEX
occurring in $A$ (resp. in $B$);
(I2): if $a, i$ occur in $A$ (resp. in $B$), then $rd(a, i)$ reduces to an A-local (resp. B-local)
constant of sort ELEM.

Completion. Some groups of instructions to be executed non-deterministically constitute
the completion phase. There is however an important difference here with respect to the
completion phase of Section 5.2: it may happen that we need some guessing also inside the
completion phase (only the instructions from group ($\gamma$) below may need such guessings).
Each instruction can be easily justified by suitable metarules (we omit the straightforward
details). The groups of instructions are the following:

($\alpha$) Apply to $A$ or to $B$ any instruction from the completion phase of Section 5.2.
($\beta$) If there is an AB-common literal that belongs to $A$ but not to $B$ (or vice versa), copy
it in $B$ (resp. in $A$).
($\gamma$) Replace undesired literals, i.e., those violating conditions (I)-(II)-(III) from Proposition 4.5.

To avoid trivial infinite loops with the ($\beta$) instructions, rules in ($\alpha$) deleting an AB-common
literal should be performed simultaneously in the A- and in the B-components (it can be
easily checked, see the proof of Theorem 6.3 below, that this is always possible, if rules
in ($\beta$) and ($\gamma$) are given higher priority).

Instructions ($\gamma$) need to be described in more detail. Preliminarily, we introduce a
technique that we call Term Sharing. Suppose that the A-component contains a literal
$a = t$, where the term $t$ is AB-common but the free constant $a$ is only A-local. Then it
is possible to "make $a$ AB-common" in the following way. First, introduce a fresh AB-common
constant $a'$ with the explicit definition $a' = t$ (to be inserted both in $A$ and in $B$,
as justified by metarule (Define0)); then replace the literal $a = t$ by $a = a'$ and replace $a$
by $a'$ everywhere else in $A$; finally, delete $a = a'$ too. The result is a pair $(A, B)$ where
basically nothing has changed but $a$ has been renamed to an AB-common constant $a'$.
Notice that the above transformations can be justified by metarules (Define0), (Redplus1),
(Redminus1), (ConstElim1). We are now ready to explain instructions ($\gamma$) in detail. First,
consider undesired literals corresponding to the rewrite rules of the form

$rd(c, i) \to d$    (6.1)

in which the left-hand side is AB-common and the right-hand side is, say, A-strict. If we
apply Term Sharing, we can solve the problem by renaming $d$ to an AB-common fresh
constant $d'$. We can apply a similar procedure to the rewrite rules

$a \to wr(c, I, E)$    (6.2)

in case the right-hand side is AB-common and the left-hand side is not; when we rename
$a$ to some fresh AB-common constant $c'$, we must arrange the precedence so that $c' > c$ to
orient the renamed literal as $c' \to wr(c, I, E)$. Then, consider the literals of the form

$\mathrm{diff}(a, b) = k$    (6.3)

in which the left-hand side is AB-common and the right-hand side is, say, A-strict. Again,
we can rename $k$ to some AB-common constant $k'$ by Term Sharing. Notice that $k'$ is
AB-common, whereas $k$ was only A-local: this implies that we might need to perform
some guessing to maintain the invariant (I1). Basically, we need to repeat Step 3 from
Section 5.1 till invariant (I1) is restored ($k'$ must be compared for equality with the other
B-local constants of sort INDEX). The last undesired literals to take care of are the rules of
the form

$c \to wr(c', I, E)$    (6.4)

having an AB-common left-hand side but, say, only an A-local right-hand side (literals of
the form $d = e$ are automatically oriented in the right way by our choice of the precedence).
Notice that from the fact that $c$ is AB-common, it follows (by our choice of the precedence)
that $c'$ is AB-common too. We can freely suppose that $I$ and $E$ are split into sub-lists
$I_1, I_2$ and $E_1, E_2$, respectively, such that $I = I_1 \cdot I_2$ and $E = E_1 \cdot E_2$, where $I_1, E_1$ are
AB-common, $I_2 = i_1, \dots, i_n$, $E_2 = e_1, \dots, e_n$ and for each $k = 1, \dots, n$ at least one from
$i_k, e_k$ is not AB-common. This $n$ (measuring essentially the number of non AB-common
symbols in (6.4)) is called the degree of the undesired literal (6.4): in the following, we shall
see how to eliminate (6.4) or to replace it with a smaller degree literal. We first make a
guess (see metarule (Disjunction1)) about the truth value of the literal $c = wr(c', I_1, E_1)$. In
the first case, we add the positive literal to the current constraint; as a consequence, we get
that the literal (6.4) is equivalent to $c = wr(c, I_2, E_2)$ and also to $rd(c, I_2) = E_2$ (see Red
in Figure 1). In conclusion, in this case, the literal (6.4) is replaced by the AB-common
rewrite rule $c \to wr(c', I_1, E_1)$ and by the literals $rd(c, I_2) = E_2$. In the second case, we
guess that the negative literal $c \neq wr(c', I_1, E_1)$ holds; we introduce a fresh AB-common
constant $c''$ together with the defining AB-common literal

$c'' \to wr(c', I_1, E_1)$    (6.5)

(see metarule (Define0)). The literal (6.4) is replaced by the literal

$c \to wr(c'', I_2, E_2)$.    (6.6)

We show how to make the degree of (6.6) smaller than $n$. In addition, we eliminate the
negative literal $c \neq c''$ coming from our guessing (notice that, according to (6.5), $c''$ renames
$wr(c', I_1, E_1)$). This is done as follows: we introduce fresh AB-common constants $i, d, d''$
together with the AB-common defining literals

$\mathrm{diff}(c, c'') = i, \quad rd(c, i) \to d, \quad rd(c'', i) \to d''$    (6.7)

(see metarule (Define0)). Now it is possible to replace $c \neq c''$ by the literal $d \neq d''$ (see
axiom (3.3)). Under the assumption Distinct($I_2$), the following statement is $\mathcal{AX}_{\mathrm{diff}}$-valid:

$c = wr(c'', I_2, E_2) \wedge rd(c'', i) = d'' \wedge rd(c, i) = d \wedge d \neq d'' \;\to\; \bigvee_{k=1}^{n} (i = i_k \wedge d = e_k)$.

We put $c > c'' > c'$ in the precedence. Notice that invariant (I2) is maintained, because all terms
$rd(c'', h)$ normalize to an element constant. In case $I_1$ is empty, one can directly take $c'$ as $c''$.

Thus, we get $n$ alternatives (see metarule (Disjunction1)). In the $k$-th alternative, we
can remove the constants $i_k, e_k$ from the constraint, by replacing them with the AB-common
terms $i, d$ respectively (see metarules (Redplus1), (Redplus2), (Redminus1),
(Redminus2), (ConstElim1), (ConstElim0)); notice that it might be necessary to complete the
index partition. In this way, the degree of (6.6) is now smaller than $n$.
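
The implication that produces the $n$ alternatives can be checked by exhaustive enumeration over small finite arrays. The sketch below is only an illustration under assumptions: arrays are tuples over a tiny index and element domain, `wr_list` and `degree_statement_holds` are hypothetical names, and `permutations` is used to enforce Distinct($I_2$).

```python
from itertools import permutations, product


def wr_list(array, indexes, elems):
    """Write elems[k] at indexes[k]; the indexes are assumed distinct."""
    out = list(array)
    for i, e in zip(indexes, elems):
        out[i] = e
    return tuple(out)


def degree_statement_holds(n_index=3, n_elem=2, n=2):
    """If c = wr(c2, I2, E2) and c, c2 differ at i, then i = i_k and d = e_k."""
    for c2 in product(range(n_elem), repeat=n_index):
        for idx2 in permutations(range(n_index), n):  # Distinct(I2)
            for el2 in product(range(n_elem), repeat=n):
                c = wr_list(c2, idx2, el2)
                for i in range(n_index):
                    d, d2 = c[i], c2[i]
                    hit = any(i == ik and d == ek
                              for ik, ek in zip(idx2, el2))
                    if d != d2 and not hit:
                        return False
    return True
```

Here `c2` stands for $c''$ and `d2` for $d''$; the check confirms the statement on the chosen domain only.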

In conclusion, if we apply exhaustively Pre-Processing and Completion instructions
above, starting from an initial pair of constraints $(A, B)$, we can produce a tree, whose
nodes are labelled by pairs of constraints (the successor nodes of a node labelled $(A, B)$ are
labelled by pairs of constraints that are obtained from $(A, B)$ by applying an instruction).
Notice that the branching in the tree is due to instructions that need guessing and that
Pre-Processing instructions are applied only in the initial segment of a branch. We call
such a tree an interpolating tree for $(A, B)$. The following result shows that we obtained an
interpolation algorithm for $\mathcal{AX}_{\mathrm{diff}}$.

Theorem 6.3. Any interpolating tree for $(A, B)$ is finite; moreover, it is an interpolating
metarules refutation (from which an interpolant can be recursively computed according to
Proposition 6.2) precisely iff $A \wedge B$ is $\mathcal{AX}_{\mathrm{diff}}$-unsatisfiable.

Proof. Since all instructions can be justified by metarules and since our instructions bring
any pair of constraints into constraints which are either manifestly inconsistent (i.e. contain
$\bot$) or satisfy the requirements of Proposition 4.5, the second part of the claim is clear. We
only have to show that all branches are finite (then König's lemma applies).

A complication that we may face here is due to the fact that during instructions ($\gamma$),
the signature is enlarged. Notice, however, that although our instructions may introduce
genuinely new AB-common array constants, they can only rename index constants, element
constants and non AB-common array constants. Moreover: (1) Term Sharing decreases the
number of the constants which are not AB-common; (2) each call in the recursive procedure
for the elimination of literals (6.4) either (2.i) renames to AB-common constants some
constants which were not AB-common before, or (2.ii) just replaces a literal of the kind
$c = wr(c', I_1 \cdot I_2, E_1 \cdot E_2)$ by the literals

$c = wr(c', I_1, E_1), \quad rd(c', I_2) = E_2$

(see the first alternative following the guessing about truth of the literal $c = wr(c', I_1, E_1)$).
Since there are only finitely many non AB-common constants at all, after finitely many
steps neither Term Sharing nor (2.i) apply anymore. We finally show that instructions ($\alpha$),
($\beta$) and (2.ii) (that do not enlarge the signature) cannot be executed infinitely many times
either. To this aim, it is sufficient to associate with each pair of constraints $(A, B)$ the
complexity measure given by the multi-set of pairs (ordered lexicographically) $(m(L), N_L)$
(varying $L \in A \cup B$), where $m(L)$ is the multi-set of terms associated with the literal $L$ and
$N_L$ is 1 if $L \in A \setminus B$, 2 if $L \in B \setminus A$, and 0 if $L \in A \cap B$. In fact, the second component
in the above pairs takes care of instructions ($\beta$), whereas the first component covers all the
remaining instructions. Notice that it is important that, whenever an AB-common literal
is deleted, the deletion happens simultaneously in both components (otherwise, the ($\beta$)
instruction could re-introduce it, causing an infinite loop; our complexity measure does not
decrease if an AB-common literal is replaced by smaller literals only in the A- or in the
B-component): in fact, it can be shown (by inspecting the instructions from the completion
phase of Subsection 5.2) that whenever an AB-common literal is deleted, the instruction
that removes it involves only AB-common literals, if undesired literals are removed first.
Thus, if instructions in ($\beta$) and ($\gamma$) have priority (as required by our specifications above),
AB-common literal deletions caused by ($\alpha$) can be performed both in the A- and in the
B-component (notice also that the instructions from ($\beta$) and (2.ii) do not remove AB-common
literals). □

Theorem 3.3, which we have already proved in Section 3.1 by using model-theoretic
notions (thus in a non-constructive way), follows immediately from the theorem above.



6.3. An Example. To illustrate our method, we describe the computation of an interpolant
for the problem

$\Pi = (A_0, B_0)$

where

$A_0 = \{\, a = wr(b, i, d) \,\}$

$B_0 = \{\, rd(a, j) \neq rd(b, j),\ rd(a, k) \neq rd(b, k),\ j \neq k \,\}$.

Notice that $i, d$ are A-strict constants, $j, k$ are B-strict constants, and $a, b$ are AB-common
constants with precedence $a > b$. The computation of the interpolant in our
framework can be represented with a tree, growing upward from $\Pi$, in which each step can
be identified with a set of appropriate metarule applications.

To begin with, we first apply Pre-Processing instructions to obtain

$A_1 = \{\, a = wr(b, i, d),\ rd(a, i) = e_5,\ rd(b, i) = e_6 \,\}$

$B_1 = \{\, rd(a, j) = e_1,\ rd(b, j) = e_2,\ rd(a, k) = e_3,\ rd(b, k) = e_4,\ e_1 \neq e_2,\ e_3 \neq e_4,\ j \neq k \,\}$.

Since $a = wr(b, i, d)$ is an undesired literal of the kind (6.4), we generate the two
subproblems

$\Pi_1 = (A_1 \cup \{ rd(b, i) = d,\ a = b \},\ B_1)$, and
$\Pi_2 = (A_1 \cup \{ a \neq b \},\ B_1)$

(this is precisely the case in which there is no need of an extra AB-common constant $c''$).

Let us consider $\Pi_1$ first. Notice that $A \vdash a = b$, and $a = b$ is AB-common. Therefore
we send $a = b$ to $B_1$, and we may derive the new equality $e_1 = e_2$ from the critical pair
(C3) $e_1 \leftarrow rd(a, j) \to rd(b, j) \to e_2$, thus obtaining

$A_2 = \{\, rd(b, i) = d,\ a = b,\ rd(a, i) = e_5,\ rd(b, i) = e_6 \,\}$

$B_2 = \{\, rd(b, j) = e_2,\ rd(a, k) = e_3,\ rd(b, k) = e_4,\ e_1 \neq e_2,\ e_3 \neq e_4,\ j \neq k,\ a = b,\ e_1 = e_2 \,\}$.

Now $B$ is inconsistent (as it contains both $e_1 \neq e_2$ and $e_1 = e_2$). The interpolant for $\Pi_1$
can be computed with the interpolating instructions of the metarules (Close2, Redplus2,
Redminus2, Propagate1) resulting in

$\phi_1 \equiv a = b$

as shown in Figure 2.

Let us see an example by considering instruction (C3). This instruction removes a literal $rd(a, i) \to e'$
using a literal $a \to wr(b, I, E)$ (and possibly rewrite rules $rd(b, i) \to d'$ as well as rewrite rules that might
reduce some of the $e', d', E$). Now, if $rd(a, i) \to e'$ is AB-common and all the other involved rules are not
undesired literals, the instruction as a whole manipulates AB-common literals. As such, if ($\beta$) has been
conveniently applied, the instruction can be performed simultaneously in the A- and in the B-component
and our specification is precisely to do that.

Each row lists the rule applied, the node it concludes, and the partial interpolant of that node
(rows are ordered from the top of the derivation to its root; the A-component, unchanged in the
upper nodes, is written as $\cdots$):

Close2:      $\cdots \mid B_1' \cup \{ a = b,\ e_1 = e_2 \}$;   $\top$
Redminus2:   $\cdots \mid B_1 \cup \{ a = b,\ e_1 = e_2 \}$;   $\top$
Redplus2:    $\cdots \mid B_1 \cup \{ a = b \}$;   $\top$
Propagate1:  $A_1 \cup \{ rd(b, i) = d,\ a = b \} \mid B_1$;   $a = b$

where $B_1' = B_1 \setminus \{ rd(b, j) = e_2 \}$.

Figure 2: Interpolant derivation for $\Pi_1$ using metarules. The derivation is to be read
bottom-up. The labels for the rules are shown on the left, while the partial
interpolants, computed top-down, are shown on the right.



Then, let us consider branch $\Pi_2$. Recall that this branch originates from the attempt
of removing the undesired rule $a \to wr(b, i, d)$. We introduce, in both $A$ and $B$, the AB-common
defining literals $\mathrm{diff}(a, b) = l$, $rd(a, l) = f_1$, $rd(b, l) = f_2$. In order to remove
$a \neq b$, we introduce $f_1 \neq f_2$ in $A$, which is propagated to $B$, thus obtaining:

$A_3 = \{\, a = wr(b, i, d),\ \mathrm{diff}(a, b) = l,\ rd(a, l) = f_1,\ rd(b, l) = f_2,\ f_1 \neq f_2 \,\}$

$B_3 = \{\, rd(a, j) = e_1,\ rd(b, j) = e_2,\ rd(a, k) = e_3,\ rd(b, k) = e_4,\ e_1 \neq e_2,\ e_3 \neq e_4,\ j \neq k,$
$\qquad \mathrm{diff}(a, b) = l,\ rd(a, l) = f_1,\ rd(b, l) = f_2,\ f_1 \neq f_2 \,\}$.

Since $a = wr(b, i, d)$ contains only the index $i$, we do not have a real case split. Therefore we
replace $i$ with $l$, and $d$ with $f_1$. At last, we propagate the AB-common literal $a = wr(b, l, f_1)$
to $B$. After all these steps we obtain:

$A_4 = \{\, a = wr(b, l, f_1),\ \mathrm{diff}(a, b) = l,\ rd(a, l) = f_1,\ rd(b, l) = f_2,\ f_1 \neq f_2 \,\}$

$B_4 = \{\, rd(a, j) = e_1,\ rd(b, j) = e_2,\ rd(a, k) = e_3,\ rd(b, k) = e_4,\ e_1 \neq e_2,\ e_3 \neq e_4,\ j \neq k,$
$\qquad \mathrm{diff}(a, b) = l,\ rd(a, l) = f_1,\ rd(b, l) = f_2,\ f_1 \neq f_2,\ a = wr(b, l, f_1) \,\}$.

Since we have one more AB-common index constant $l$, we complete the current index
constant partition, namely $\{k\}$ and $\{j\}$: we have three alternatives, to let $l$ stay alone
in a new class, or to add $l$ to one of the two existing classes. In the first alternative,
because of the following critical pair (C3) $e_1 \leftarrow rd(a, j) \to rd(wr(b, l, f_1), j) \to e_2$, we add
$e_1 = e_2$ to $B$, which becomes trivially unsatisfiable. The other two alternatives yield similar
outcomes. For each sub-problem the interpolant is $\top$. The partial interpolant for $\Pi_2$ has
to be reconstructed by the reverse application of the interpolating instructions of (Define0)
and (Propagate1), as shown in Figure 3, which yield

$\phi_2 \equiv (a = wr(b, \mathrm{diff}(a, b), rd(a, \mathrm{diff}(a, b))) \wedge rd(a, \mathrm{diff}(a, b)) \neq rd(b, \mathrm{diff}(a, b)))$.

Each row lists the rule applied, the node it concludes, and the partial interpolant of that node
where it is shown (rows are ordered from the top of the derivation to its root; only the first
alternative of Disjunction2 is displayed):

Close2:        $\cdots \mid B_1'' \cup C \cup \{ l \neq k,\ l \neq j,\ e_1 = e_2 \}$;   $\top$
Redminus2:     $\cdots \mid B_1' \cup C \cup \{ l \neq k,\ l \neq j,\ e_1 = e_2 \}$;   $\top$
Redplus2:      $\cdots \mid B_1' \cup C \cup \{ l \neq k,\ l \neq j \}$;   $\top$
Disjunction2:  $A_1' \cup \{ f_1 \neq f_2,\ a = wr(b, l, f_1) \} \cup C \mid B_1' \cup C$
Propagate1:    $A_1' \cup \{ f_1 \neq f_2,\ a = wr(b, l, f_1) \} \cup C \mid B_1 \cup C \cup \{ f_1 \neq f_2 \}$;   $a = wr(b, l, f_1)$
Redminus1:     $A_1 \cup \{ f_1 \neq f_2,\ a = wr(b, l, f_1) \} \cup C \mid B_1 \cup C \cup \{ f_1 \neq f_2 \}$;   $a = wr(b, l, f_1)$
Redplus1:      $A_1 \cup \{ f_1 \neq f_2 \} \cup C \mid B_1 \cup C \cup \{ f_1 \neq f_2 \}$;   $a = wr(b, l, f_1)$
Propagate1:    $A_1 \cup \{ f_1 \neq f_2 \} \cup C \mid B_1 \cup C$;   $a = wr(b, l, f_1) \wedge f_1 \neq f_2$
Redminus1:     $A_1 \cup \{ a \neq b,\ f_1 \neq f_2 \} \cup C \mid B_1 \cup C$;   $a = wr(b, l, f_1) \wedge f_1 \neq f_2$
Redplus1:      $A_1 \cup \{ a \neq b \} \cup C \mid B_1 \cup C$;   $a = wr(b, l, f_1) \wedge f_1 \neq f_2$
Define0*:      $A_1 \cup \{ a \neq b \} \mid B_1$;   $\phi_2$

where

$C = \{ \mathrm{diff}(a, b) = l,\ rd(a, l) = f_1,\ rd(b, l) = f_2 \}$
$A_1' = A_1 \setminus \{ a = wr(b, i, d) \}$
$B_1' = B_1 \cup \{ f_1 \neq f_2,\ a = wr(b, l, f_1) \}$
$B_1'' = B_1' \setminus \{ rd(a, j) = e_1 \}$
$\phi_2 \equiv (a = wr(b, \mathrm{diff}(a, b), rd(a, \mathrm{diff}(a, b))) \wedge rd(a, \mathrm{diff}(a, b)) \neq rd(b, \mathrm{diff}(a, b)))$

Figure 3: Interpolant derivation for $\Pi_2$ using metarules. The derivation is to be read
bottom-up. The labels for the rules are shown on the left, while the partial
interpolants, computed top-down, are shown on the right.
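
The completion of the index partition used in branch $\Pi_2$ (a new index constant either forms its own class or joins an existing one) can be sketched as follows. The list-of-lists representation and the name `extend_partition` are assumptions made for this illustration only.

```python
def extend_partition(partition, new_const):
    """Return every way of adding `new_const` to a partition of index constants."""
    # Alternative 0: the new constant stays alone in a fresh class.
    alternatives = [[list(cls) for cls in partition] + [[new_const]]]
    # Remaining alternatives: the new constant joins one existing class.
    for target in range(len(partition)):
        alternatives.append([
            list(cls) + [new_const] if pos == target else list(cls)
            for pos, cls in enumerate(partition)
        ])
    return alternatives
```

With the partition `[["k"], ["j"]]` and the new constant `"l"`, this yields the three alternatives discussed in the example.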

The final interpolant is computed by combining the interpolants for $\Pi_1$ and $\Pi_2$ by
means of (Disjunction1), yielding

$\phi \equiv \phi_1 \vee \phi_2 \equiv (a = b \vee (a = wr(b, \mathrm{diff}(a, b), rd(a, \mathrm{diff}(a, b))) \wedge rd(a, \mathrm{diff}(a, b)) \neq rd(b, \mathrm{diff}(a, b))))$

which can be simplified to $\phi \equiv (a = wr(b, \mathrm{diff}(a, b), rd(a, \mathrm{diff}(a, b))))$.
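
As a sanity check, the two interpolant conditions for the simplified $\phi$ can be tested by enumeration on small finite arrays. This is a minimal sketch under assumptions: arrays are tuples over three indexes and two elements, and `diff_choices` (a hypothetical helper) lists the values that $\mathrm{diff}(a, b)$ may take, namely any index where $a$ and $b$ differ, or any index at all when $a = b$.

```python
from itertools import product

INDEXES = range(3)
ELEMS = range(2)


def wr(array, i, e):
    """Return a copy of `array` with e stored at index i."""
    out = list(array)
    out[i] = e
    return tuple(out)


def diff_choices(a, b):
    """Admissible values of diff(a, b) on this finite domain."""
    differing = [i for i in INDEXES if a[i] != b[i]]
    return differing if differing else list(INDEXES)


def phi(a, b, l):
    """The simplified interpolant, with l standing for diff(a, b)."""
    return a == wr(b, l, a[l])


def is_interpolant():
    arrays = list(product(ELEMS, repeat=len(INDEXES)))
    for a, b in product(arrays, repeat=2):
        for l in diff_choices(a, b):
            # A_0 = { a = wr(b, i, d) } must entail phi.
            for i, d in product(INDEXES, ELEMS):
                if a == wr(b, i, d) and not phi(a, b, l):
                    return False
            # phi together with B_0 must be unsatisfiable.
            for j, k in product(INDEXES, repeat=2):
                b0 = a[j] != b[j] and a[k] != b[k] and j != k
                if phi(a, b, l) and b0:
                    return False
    return True
```

Intuitively, $\phi$ says that $a$ and $b$ differ in at most one position, which follows from $A_0$ and contradicts the two distinct differing positions required by $B_0$.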

7. Related work and Conclusions 

There are two main lines of work in the literature that are relevant for our paper: satisfiability
procedures for variants and extensions of the theory of arrays and interpolation methods
related to the theory of arrays. Below, we discuss the works which are more closely related
to our approach in some detail.

7.1. Satisfiability. Since its introduction by McCarthy in [33], the theory of arrays has
received a lot of attention in automated theorem proving and verification because of its
importance in modelling fundamental mechanisms of hardware and software systems such
as memory read and write operations. For example, a lot of papers have been devoted
to design, prove correct, and build decision procedures for the satisfiability problem of
quantifier-free and selected classes of quantified formulae in (various extensions of) the
theory of arrays; e.g., [21151 [ra[25l [29 H3n [3811111 [55] . The interested reader is pointed to 
the 'related work' sections of |251I30| for a comprehensive overview. Here, we notice that 
many of them are based on instantiating the axioms of the theory so that rd and wr can 
be considered as uninterpreted functions and state-of-the-art procedures for the theory of 
equality can be used. Notable exceptions are [2ll41[[55] where techniques based on rewriting 
or constraint solving are used. 

In [2], the standard superposition calculus [5] is proven to terminate on the union of
the theory of arrays and a set of ground literals, thereby providing a decision procedure
for the quantifier-free satisfiability problem because of the refutation completeness of the
calculus. (The efficiency of the approach is explored in [1].) While saturation (roughly,
the exhaustive application of the rules of the superposition calculus) can be seen as a
generalization of completion where clauses, and not only equalities, are handled, our Gaussian
completion* has some distinctive features. In fact, while the three critical pairs (C3), (C4),
and (C5) in Section 5.2 can be regarded as instances of the inference rules of a superposition
calculus (see [2] for details), the critical pairs (C1) and (C2), exploiting the equivalences in
Figure 1, are impossible to recast in any standard completion procedure (see, e.g., [1]). In
fact, the way in which the critical pairs (C1) and (C2) are eliminated involves the addition
of equalities containing rd's (in order to constrain the values stored at certain locations in
the arrays mentioned in the rules of the critical pair), besides the replacement of one or both
of the parent rewrite rules by an equality. Only in this way were we able to eliminate badly
orientable rules. It seems difficult to adapt the approach in [2] to the problem under
consideration, mainly because of the chosen order > over terms. In fact, we orient the equality
a → wr(b, i, e) from left to right if a > b, and use the equivalences in Figure 1 when b > a
(or a and b are identical). This allows us to eliminate all critical pairs with rules (4.1)-(4.4)
in Definition 4.2, since such rules contain just one variable of sort ARRAY and, trivially, no
critical pairs involving the variable need to be considered. If we chose the other way of
orienting the equalities of the form a = wr(b, i, e), several critical pairs would arise. Although
the completion of these pairs terminates under suitable assumptions (as shown in [2]), this
creates serious problems when considering the computation of interpolants.

In [41], a satisfiability procedure for the theory of arrays with extensionality is designed
so as to be easily combined with other procedures by the Shostak combination method (see,
e.g., [51]). Two interface functionalities are required by the Shostak combination method:
(i) normalizing terms and (ii) solving equalities. We consider each activity in detail.

(i) In Chapter 5 of [41], a canonical form for terms built by using a single rd or
several wr's is defined by using a simplification ordering. The canonical terms are
similar to those occurring in a modular constraint according to Definition 4.2 above.
A major difference is the use of if-then-else's to normalize read-terms in [41], while our
procedure does not use them because item (i) of Definition 4.2 implies that any two
indexes in a constraint in normal form are known to be distinct. This choice makes the
proof of the correctness of our procedure much easier than the argument for
the correctness proposed in [41], which "has proved elusive to the authors" of [55]. So-called
'lazy' SMT solvers, based on the integration of a SAT solver and a satisfiability
procedure for conjunctions of literals, seem to be able to easily implement the case-splitting
required to derive a complete partition by resorting to the available SAT
solver as explained, e.g., in [9].
(ii) To compare with the activity of solving equalities in [41], let us preliminarily observe
that the logical equivalences in Figure 1 can be considered as rewrite rules (either from
left to right or vice versa) that help us replace badly orientable equalities (recall the
definition at the beginning of Section 3) with equalities which are oriented from left
to right. This is precisely how the equivalences in Figure 1 are used in the Gaussian
completion procedure (of Section 5.2) to eliminate critical pairs. Similarly, in order to
provide one of the basic functionalities required by the Shostak combination framework,
[41] designs a solver for equalities involving wr operations. For example, the
procedure in [41] allows one to solve the equality a = wr(b, i, e) for b. We can adapt
our procedure (in particular, by using the equivalences Symm and Refl of Figure 1)
to do the same. The main difference is that our normalization is done off-line, i.e.,
the signature is fixed since all terms appearing in the constraint are given, while the
procedure in [41] must be on-line since it is to be integrated in a Shostak combination
algorithm, which requires equalities to be processed one at a time, as soon as they
become available. Because of this, our completion algorithm can be simplified (since
there is no need to compute intermediate normal forms) and standard techniques to
show its termination can be used. In contrast, [41] gives only a brief sketch of the
termination of his procedure. For a more comprehensive comparison of on-line and
off-line completion algorithms revisiting the Shostak congruence closure algorithm, the
reader is pointed to [6, 36].

* The Gauss elimination procedure for systems of linear equalities has been lifted to elementary theories
in [3] and, since the theory of arrays is close to being Gaussian [TH], we show that 'Gaussian-like' steps can
be exploited during the completion phase.

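
To make "solving a = wr(b, i, e) for b" concrete, the sketch below checks by brute force an equivalence that turns the equality around. It is an assumption for illustration: Figure 1 is not reproduced in this section, so the exact statements of Symm and Refl may differ from the formula tested here, which is only a standard consequence of the array axioms. Arrays are modelled as tuples over small finite sorts, and all function names are hypothetical.

```python
from itertools import product


def rd(a, i):
    """Read the content of array a at index i."""
    return a[i]


def wr(a, i, e):
    """Return the array obtained from a by storing e at index i."""
    return a[:i] + (e,) + a[i + 1:]


def solved_form(a, b, i, e):
    """Candidate solved form of a = wr(b, i, e), with b isolated on the left.

    Assumed shape (not taken from Figure 1):
        b = wr(a, i, rd(b, i))  and  rd(a, i) = e
    """
    return b == wr(a, i, rd(b, i)) and rd(a, i) == e


def solving_is_sound(n_indexes=3, n_elems=2):
    """Compare a = wr(b, i, e) with its solved form on a small finite model."""
    arrays = list(product(range(n_elems), repeat=n_indexes))
    return all(
        (a == wr(b, i, e)) == solved_form(a, b, i, e)
        for a in arrays
        for b in arrays
        for i in range(n_indexes)
        for e in range(n_elems)
    )
```

The residual equality rd(a, i) = e is an example of the read-constraints that, as noted above for the critical pairs (C1) and (C2), have to be added when an equality between array updates is reoriented.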
The procedure in [55] shares with [5T] and ours the key activity of solving equalities. The
main difference is that no canonical forms for terms or constraints are defined in [55]; rather,
a special form of equality over arrays is introduced, called partial equality, which compares
the content of two arrays everywhere except at a (finite) set of indexes. Formally, this is defined as
follows: $a =_I b$ iff for every index $i$ not in the set $I$, the content of $a$ at $i$ is equal to that
of $b$ at the same index $i$. Thus, an equality of the form wr(a, i, e) = b can be rewritten as
$a =_{\{i\}} b \wedge rd(b, i) = e$. The key insight of [55] is that it is possible to eliminate all wr's,
so that arrays can be considered as uninterpreted functions and rd as function application,
and a slightly modified congruence closure (to cope with partial equality) can be used to
check satisfiability. While no standard rewriting techniques are used in [55], it is interesting
to notice that two arrays a and b are cardinality dependent iff there exists a finite set $I$
of indexes such that $a =_I b$. We do not introduce a new predicate symbol for partial equality,
nor do we use it in designing a satisfiability procedure; nevertheless, we exploit this notion and its
preservation through embeddings (see Lemma 3.1) in our semantic interpolation proofs.
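
The rewriting of wr(a, i, e) = b into a partial equality plus a read can be illustrated on finite arrays. This is a minimal sketch, assuming arrays are Python tuples over small finite sorts; `partial_eq` and `elimination_is_sound` are hypothetical names introduced here and are not the data structures of [55].

```python
from itertools import product


def rd(a, i):
    """Read the content of array a at index i."""
    return a[i]


def wr(a, i, e):
    """Return the array obtained from a by storing e at index i."""
    return a[:i] + (e,) + a[i + 1:]


def partial_eq(a, b, excluded):
    """a =_I b: a and b agree at every index outside the finite set I."""
    return all(a[i] == b[i] for i in range(len(a)) if i not in excluded)


def elimination_is_sound(n_indexes=3, n_elems=2):
    """Check wr(a, i, e) = b  iff  a =_{i} b and rd(b, i) = e on a finite model."""
    arrays = list(product(range(n_elems), repeat=n_indexes))
    return all(
        (wr(a, i, e) == b) == (partial_eq(a, b, {i}) and rd(b, i) == e)
        for a in arrays
        for b in arrays
        for i in range(n_indexes)
        for e in range(n_elems)
    )
```

Note that the right-hand side no longer mentions wr, which is what allows arrays to be treated as uninterpreted functions in the approach described above.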

7.2. Interpolation. After McMillan's seminal work on interpolation for model checking [451 
HE], several papers [IIl[2ll[22l[Ml[371[39lll2lll6l[52l[Ml[56] appeared whose aim was to design 
techniques for the efficient computation of interpolants in first-order theories of interest 
for verification, mainly uninterpreted function symbols, fragments of Linear Arithmetic, or 
their combination. An interpolating theorem prover is described in [37], where a sequent-like
calculus is used to derive interpolants from proofs in propositional logic, equality with
uninterpreted functions, linear rational arithmetic, and their combinations. The method
described in [56] proposes a framework suitable for lazy SMT-solvers, in which the theory
solver is required to derive a partial interpolant for each theory lemma it produces. The
global interpolant can then be computed at the propositional level. The paper also illustrates
a method to derive interpolants in a Nelson-Oppen combination procedure, under
certain restrictions on the theories to combine. More recently, in [22] the ideas of [56] are
adapted to cope with state-of-the-art SMT-solving strategies for combinations of the theories
of uninterpreted functions and a fragment of Linear Arithmetic (called difference logic).
In [37], a method to compute interpolants in data structures theories, such as sets and arrays
(with extensionality), by axiom instantiation and interpolant computation in the theory of
uninterpreted functions is described. It is also shown that the theory of arrays with
extensionality does not admit quantifier-free interpolation. The "split" prover in [M] applies a
sequent calculus for the synthesis of interpolants along the lines of that in [47] and is tuned
for predicate abstraction [53]. In particular, the method is shown to be complete in the
sense that the computed interpolants are guaranteed to provide the "right" level of abstraction
to prove a certain property, if one exists. The "split" prover can handle a combination of
theories, among which the theory of arrays without extensionality is also considered. In [34],
it is pointed out that the theory of arrays poses serious problems in deriving quantifier-free
interpolants because it entails an infinite set of quantifier-free formulae, which is indeed
problematic when interpolants are to be used for predicate abstraction. To overcome the
problem, [34] suggests constraining array-valued terms to occur in equalities of the form
a = wr(a, I, E) in the notation of this paper. It is observed that this corresponds to the
way in which arrays are used in imperative programs. Further limitations are imposed on
the symbols in the equalities in order to obtain a complete predicate abstraction procedure.
In [35], the method described in [34] is specialized to apply CEGAR techniques [23] for
the verification of properties of programs manipulating arrays. The method of [34] is
extended to cope with range predicates, which allow one to describe unbounded array segments
and thus to formalize typical programming idioms of arrays, yielding property-sensitive
abstractions. In [54], it is shown how to extend satisfiability procedures based on axiom
instantiation to compute interpolants. However, the theory of arrays is not considered.
In [52], the approach of [54] is specialized to compute interpolants in the combination of 
Linear Rational Arithmetic and the theory of uninterpreted function symbols; again, the 
theory of arrays is not considered. A method for deriving interpolants in the theory of 
equality with uninterpreted functions is also given in [28] by extending a congruence closure
algorithm. In [39], a method to derive quantified invariants for programs manipulating
arrays and integer variables is described. A resolution-based prover is used to handle an
ad hoc axiomatization of arrays by using predicates. Neither McCarthy's theory of arrays
nor any of its extensions is considered in [39]. The invariant synthesis method is based on
the computation of interpolants derived from the proofs of the resolution-based prover and
constraint solving techniques to handle the arithmetic part of the problem. The resulting
interpolants may even contain quantifier alternations.

Recent research on interpolating procedures has been focusing on (extensions of) Linear
Integer Arithmetic. An interpolating procedure for linear Diophantine equalities is outlined
in [33]. A procedure for full Linear Integer Arithmetic based on a sequent calculus can
be found in [11]. In [12], the procedure in [11] is extended to cope with the theory of
arrays without extensionality by axiom instantiation and interpolation in the combination of
Presburger Arithmetic and uninterpreted functions. Quantifiers can occur in the interpolants
returned by the procedure. Recently [16], we have proposed a quantifier-free interpolation
solver for $\mathcal{AX}_{\mathrm{diff}}$ when combined with integer difference logic over indexes.

7.3. Conclusions and Future Work. We believe that the procedure proposed in this
paper is a significant step forward in making model checking more widely applicable to
programs whose properties depend crucially on the manipulation of arrays. To the best of
our knowledge, in fact, our interpolation procedure is the first to compute quantifier-free
interpolants for a natural variant of the theory of arrays with extensionality obtained by
replacing the extensionality axiom with its Skolemization. This variant is 'natural' in the
sense that it is sufficient to detect unsatisfiability of formulae, as is usually the case in
standard model checking methods for infinite state systems.

Although the work reported in this paper is a significant step forward in widening the
scope of applicability of interpolation in model checking of array-manipulating programs,
several interesting directions for further work remain, which we discuss below.

The implementation of the interpolating procedure proposed here is crucial for showing
the practical viability of our approach. In this respect, the first step is to implement the
satisfiability solver in Section 5. Recall that this requires guessing, a pre-processing phase,
and a Gaussian completion phase. Guessing, as already observed in Section 7.1, item (i),
when discussing the relationship with the solver of [41], can be implemented by adapting
the mechanism to handle arrangements when combining satisfiability procedures in the
Delayed Theory Combination approach of [9]. The main advantage of this approach is
the use of state-of-the-art SAT techniques to efficiently enumerate all possible partitions of
indexes. The pre-processing phase can be implemented by using the data structures and
basic expression-manipulating procedures available in many state-of-the-art SMT solvers.
The Gaussian completion phase requires more effort, but it can adapt and reuse well-known
techniques developed in rewriting for completion procedures (see, e.g., [4]). The second
step to build the interpolating procedure of Section 6 is to implement the interpolating
metarules of Table 1. This is relatively simple and can
be done on top of the existing infrastructure for proof generation that is available in many
state-of-the-art SMT solvers.
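
The guessing phase mentioned above ranges over all partitions of the index constants. As a rough illustration only, the sketch below enumerates them naively in Python; it is not the SAT-based mechanism of Delayed Theory Combination, and the names `partitions` and `as_literals` are hypothetical helpers introduced here.

```python
def partitions(items):
    """Yield every partition of `items` as a list of blocks (lists)."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        # Put `first` into each existing block in turn ...
        for k, block in enumerate(smaller):
            yield smaller[:k] + [[first] + block] + smaller[k + 1:]
        # ... or into a new block of its own.
        yield [[first]] + smaller


def as_literals(partition):
    """Turn a partition into index equalities and disequalities.

    Indexes in the same block are equated with the block representative;
    representatives of different blocks are declared pairwise distinct.
    """
    eqs = [(block[0], x) for block in partition for x in block[1:]]
    reps = [block[0] for block in partition]
    neqs = [(r, s) for k, r in enumerate(reps) for s in reps[k + 1:]]
    return eqs, neqs
```

For three index constants, `list(partitions(["i", "j", "k"]))` has five elements. The number of partitions grows quickly with the number of indexes (it is the Bell number), which is why delegating the enumeration to a SAT solver, as suggested in the text, is attractive.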

We are currently developing an implementation of the procedure presented here in the
SMT-solver OpenSMT [18]. Preliminary experiments are encouraging, although a more
extensive experimental evaluation is needed. In fact, it is well-known that the convergence
of interpolation-based model checking procedures crucially depends on the "quality" of the
computed interpolants. There have been attempts (see, e.g., [34, 49]) to build interpolating
procedures that return "high quality" interpolants that guarantee the convergence of model
checking for valid properties. Recently, it has been observed [26, 44] that a certain degree
of flexibility for tuning the computation of interpolants in interpolation procedures would
be desirable to facilitate their integration in model checking. In this respect, it would be
particularly interesting to investigate how the order in which the interpolating metarules of
Table 1 are applied, particularly those on AB-common terms, may influence the "quality" of
the interpolants. An interesting alternative to investigate the flexibility of generating
interpolants (suggested in [44]) would be to use the procedure presented here in the framework
for computing quantified interpolants of [44].

Finally, there are two more interesting points that deserve further investigation. First,
it would be interesting to study the size of the interpolating metarules refutations and
compare them with those of interpolating procedures based on a proof calculus. The preliminary
experiments with our implementation of the procedure in OpenSMT show that our refutations
are quite compact, but a more systematic comparison with available procedures based
on a proof calculus, e.g., [47], is needed to clarify this issue. Second, since in model checking
it is useful to compute interpolants for several partitions of the same (unsatisfiable)
formula, it would be interesting to design a method that permits the partial reuse of the
interpolants returned for a partition to compute the interpolant for the next one, so as to
avoid the performance degradation due to recomputing parts
of interpolating metarules refutations. In this respect, it seems possible to adapt techniques
developed for computing chains of interpolants in [13].

Appendix A. Proof of Theorem 2.3

Theorem 2.3 [7] Let $T$ be universal. Then, $T$ admits quantifier-free interpolation iff $T$
has the amalgamation property.

Proof. Suppose first that $T$ has amalgamation; let $A, B$ be quantifier-free formulae such that
$A \wedge B$ is not $T$-satisfiable. Let us replace variables with free constants in $A, B$; let us call
$\Sigma^A$ the signature $\Sigma$ of $T$ expanded with the free constants from $A$ and $\Sigma^B$ the signature $\Sigma$
expanded with the free constants from $B$ (we put $\Sigma^C := \Sigma^A \cap \Sigma^B$). For reductio, suppose
that there is no ground formula $C$ such that: (a) $A$ $T$-entails $C$; (b) $C \wedge B$ is $T$-unsatisfiable;
(c) only free constants from $\Sigma^C$ occur in $C$.

As a first step, we build a maximal $T$-consistent set $\Gamma$ of ground $\Sigma^A$-formulae and
a maximal $T$-consistent set $\Delta$ of ground $\Sigma^B$-formulae such that $A \in \Gamma$, $B \in \Delta$, and
$\Gamma \cap \Sigma^C = \Delta \cap \Sigma^C$. (By abuse, we use $\Sigma^C$ to indicate not only the signature $\Sigma^C$ but also
the set of formulae in the signature $\Sigma^C$.) For simplicity, let us assume that $\Sigma$ is at most
countable (this is just to avoid a straightforward transfinite induction argument), so that we
can fix two enumerations

$$A_1, A_2, \ldots \qquad B_1, B_2, \ldots$$

of ground $\Sigma^A$- and $\Sigma^B$-formulae, respectively. We build inductively $\Gamma_n, \Delta_n$ such that for
every $n$: (i) $\Gamma_n$ contains either $A_n$ or $\neg A_n$; (ii) $\Delta_n$ contains either $B_n$ or $\neg B_n$; (iii) there is
no ground $\Sigma^C$-formula $C$ such that $\Gamma_n \cup \{\neg C\}$ and $\Delta_n \cup \{C\}$ are not $T$-consistent. Once
this is done, we can get our $\Gamma, \Delta$ as $\Gamma := \bigcup \Gamma_n$ and $\Delta := \bigcup \Delta_n$.

We let $\Gamma_0$ be $\{A\}$ and $\Delta_0$ be $\{B\}$ (notice that (iii) holds by (a)-(b)-(c) above). To build
$\Gamma_{n+1}$ we have two possibilities, namely $\Gamma_n \cup \{A_n\}$ and $\Gamma_n \cup \{\neg A_n\}$. Suppose they are both
unsuitable because there are $C_1, C_2 \in \Sigma^C$ such that the sets

$$\Gamma_n \cup \{A_n, \neg C_1\}, \quad \Delta_n \cup \{C_1\}, \quad \Gamma_n \cup \{\neg A_n, \neg C_2\}, \quad \Delta_n \cup \{C_2\}$$

are all $T$-inconsistent. If we put $C := C_1 \vee C_2$, we get that $\Gamma_n \cup \{\neg C\}$ and $\Delta_n \cup \{C\}$ are
not $T$-consistent, contrary to the induction hypothesis. A similar argument shows that we can
also build $\Delta_n$.

Let now $\mathcal{M}_1$ be a model of $\Gamma$ and $\mathcal{M}_2$ be a model of $\Delta$. Consider the substructures
$\mathcal{N}_1, \mathcal{N}_2$ of $\mathcal{M}_1, \mathcal{M}_2$ generated by the interpretations of the constants from $\Sigma^C$: since the
related diagrams are the same (because $\Gamma \cap \Sigma^C = \Delta \cap \Sigma^C$), we have that $\mathcal{N}_1$ and $\mathcal{N}_2$ are
$\Sigma^C$-isomorphic. Up to renaming, we can suppose that $\mathcal{N}_1$ and $\mathcal{N}_2$ are just the same substructure
(let us call it $\mathcal{N}$ for short). Since the theory $T$ is universal and truth of universal sentences is
preserved by substructures, we have that $\mathcal{N}$ is a model of $T$. By the amalgamation property,
there is a $T$-amalgam $\mathcal{M}$ of $\mathcal{M}_1$ and $\mathcal{M}_2$ over $\mathcal{N}$. Now $A, B$ are ground formulae true in
$\mathcal{M}_1$ and $\mathcal{M}_2$, respectively, hence they are both true in $\mathcal{M}$, which is impossible because
$A \wedge B$ was assumed to be $T$-inconsistent.

Suppose now that $T$ has quantifier-free interpolants. Take two models $\mathcal{M}_1 = (M_1, \mathcal{I}_1)$
and $\mathcal{M}_2 = (M_2, \mathcal{I}_2)$ of $T$ sharing a substructure $\mathcal{N} = (N, \mathcal{J})$. In order to show that a
$T$-amalgam of $\mathcal{M}_1, \mathcal{M}_2$ over $\mathcal{N}$ exists, it is sufficient (by Robinson Diagram Lemma 2.2) to
show that $\delta_{\mathcal{M}_1}(M_1) \cup \delta_{\mathcal{M}_2}(M_2)$ is $T$-consistent. If it is not, by the compactness theorem of
first order logic, there exist a $\Sigma \cup M_1$-ground sentence $A$ and a $\Sigma \cup M_2$-ground sentence $B$
such that (i) $A \wedge B$ is $T$-inconsistent; (ii) $A$ is a conjunction of literals from $\delta_{\mathcal{M}_1}(M_1)$; (iii) $B$
is a conjunction of literals from $\delta_{\mathcal{M}_2}(M_2)$. By the existence of quantifier-free interpolants,
taking free constants instead of variables, we get that there exists a ground $\Sigma \cup N$-sentence
$C$ such that $A$ $T$-entails $C$ and $B \wedge C$ is $T$-inconsistent. The former fact yields that $C$ is
true in $\mathcal{M}_1$ and hence also in $\mathcal{N}$ and in $\mathcal{M}_2$, because $C$ is ground. However, the fact that
$C$ is true in $\mathcal{M}_2$ contradicts the fact that $B \wedge C$ is $T$-inconsistent. □
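
The inductive step in the first half of the proof (the choice $C := C_1 \vee C_2$) is stated without detail. The following derivation is a worked restatement of that step, written under the assumption that $\Gamma_n$ and $\Delta_n$ consist of ground sentences, so that a set is $T$-inconsistent exactly when $T$ together with the rest of the set entails the negation of any one of its members.

```latex
\begin{align*}
&\Gamma_n \cup \{A_n, \neg C_1\} \text{ is } T\text{-inconsistent}
  &&\Longrightarrow\quad T \cup \Gamma_n \models A_n \to C_1,\\
&\Gamma_n \cup \{\neg A_n, \neg C_2\} \text{ is } T\text{-inconsistent}
  &&\Longrightarrow\quad T \cup \Gamma_n \models \neg A_n \to C_2,\\
&\text{hence } T \cup \Gamma_n \models C_1 \vee C_2,
  &&\text{i.e. } \Gamma_n \cup \{\neg (C_1 \vee C_2)\} \text{ is } T\text{-inconsistent};\\
&\Delta_n \cup \{C_1\},\ \Delta_n \cup \{C_2\} \text{ are } T\text{-inconsistent}
  &&\Longrightarrow\quad T \cup \Delta_n \models \neg C_1 \wedge \neg C_2,\\
& &&\text{i.e. } \Delta_n \cup \{C_1 \vee C_2\} \text{ is } T\text{-inconsistent}.
\end{align*}
```

Both sets being $T$-inconsistent for the $\Sigma^C$-formula $C_1 \vee C_2$ is exactly what invariant (iii) forbids for $\Gamma_n, \Delta_n$, which gives the contradiction used in the proof.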